FOREWORD


The idea of the “management sciences” is as old as science itself; it is simply
the idea that man may apply his highly refined methods of gaining knowledge 
to the management of his own affairs, be they entrepreneurial, governmental, 
educational, or whatever. 

This idea crops up in Western thought in Plato and Aristotle, later in the stoic 
philosophy, later in many medieval and renaissance writings, and in the many 
facets of economic and social philosophy of the nineteenth century. 

In our own century there has been quite a surge forward in exploring the idea 
of the management sciences, and for good reasons. Everyone knows that our 
social systems have become much more complicated than any before; the
“manager” of today is faced with problems of such magnitude and seriousness
that he can no longer rely on good common sense or flashes of insight to solve
them. Many managers of even a decade ago were proud of the fact that they
“flew by the seat of their pants” and that their companies or government agencies
“grew like Topsy.” Nowadays we realize that a manager who flies by the seat
of his pants is apt to have the future hit him in the same place, and that the new
Topsies are monsters no Frankenstein could have imagined.

In an attempt to meet the challenges of the modern technological world, a 
number of professional societies and informal groups were started after World 
War II: information scientists, systems engineers, general systems scientists, 
control scientists, behavioral scientists. Of these groups, one had a very broad 
and specific objective: to establish a profession of scientists and engineers whose 
mission is to improve large, complex systems of all kinds. This group goes under 
the label of “operations research.”

The professional interest of operations research is matched by the scientific 
and philosophical interest of another group with equally broad interests. In 
1954 The Institute of Management Sciences was formed to identify, extend, and 
unify scientific knowledge that contributes to the understanding and practice of 
management. Thus, the idea behind the Institute was to create a union of many 
groups of managers, engineers, scientists, and others who have a common interest 
in understanding and improving man’s environment as well as man himself. 
Whereas the cohesion of operations research lies in its development of a
discipline and a profession, the unity of the management sciences lies in a common
aim. In the end, this common aim of many different parties may merge with the
aim of a coherent discipline; it was with just such an idealist hope in mind that
the founders of the new Institute of Management Sciences called the journal
Management Science, i.e., used the singular “science” for the journal, the plural
“sciences” for the Institute that was combining such a diversity of interests.

There are many ways to learn about the management sciences. Books and 
articles on the subject are appearing in increasing numbers. Unfortunately, too 
many of these are simply attempts to “sell” the managers on the idea of hiring
professional scientists. The “unity” of interests turns out to be economic, which
of course is all right, but not the whole picture. Indeed, there is no reason at all
why the industrial manager shouldn’t “sell” the scientist the idea of the scientist’s
working for the manager. Or why the government executive shouldn’t “tell” the
scientist about some of the realities of government life. Or why a manager 
shouldn’t apply some of his knowledge of management to the management of 
science itself. In a successful Institute of Management Sciences the conversations 
will go in all directions with equal weight and authority. 

In order to appreciate the real idea behind the management sciences, one 
should read the material in its original form, as people struggle to express 
their ideas and to establish a sound basis for their positions. This is a way that 
may be difficult to follow, but far more rewarding in the end. 

This volume of papers selected from the first decade of Management Science— 
together with its companion volume, Executive Readings in Management Science — 
provides just such an opportunity. The editors have wisely decided to divide 
the papers into two volumes, one dealing with fundamental scientific work of 
applied mathematicians, the other with work in various scientific disciplines, 
as well as contributions from managers themselves. The reader should be able 
to acquire from these two volumes a real flavor of the current topics of
conversation that make up the management sciences of today. He should also be able
to sense those topics that are not being discussed but should be: the subtle 
problems of human values or the more complex problems of large systems like 
education, water, urban living, etc. He will come to realize that in this effort to 
develop a living conversation about the deepest problems we humans face, too 
many people are remaining silent, absorbed in their own little enterprises. A 
unity of the management sciences will occur only if everyone begins to speak. 


C. West Churchman 



PREFACE 


In 1961 the Institute of Management Sciences, at the suggestion of The 
Macmillan Company, decided to publish two volumes of selected reprints of 
articles from Management Science. The purpose of this project was to make some 
of the significant work of contributors to the journal more readily available. 
One volume was intended for executives; the other was intended for
mathematically oriented specialists in the management sciences.

During the latter half of 1962 Martin K. Starr and I were appointed editors of 
the projected anthologies. In the next year we each read those of the papers in 
Management Science that seemed respectively appropriate, and then we selected 
papers for inclusion in the volumes. The articles in this book were chosen from 
the first eight volumes of Management Science. The papers in Mr. Starr's book, 
Executive Readings in Management Science , were selected from the first nine 
volumes of Management Science. Our selections were reviewed by the editors of 
Management Science and approved by the editor-in-chief. This approval should 
not be construed as an official statement by the Institute of Management 
Sciences concerning the relative merits of papers included and excluded from 
the volumes. 


The papers in this volume are limited to those that contribute original
research in mathematical aspects of the management sciences. Surveys and
expository papers are specifically excluded. Even with this restriction I found that
many fine papers could not be included for lack of space. However, I believe 
that the papers reprinted here are fairly representative and are of uniformly 
high quality. 

An attempt has been made to correct any substantive or typographical errors 
that appeared in the original published papers. Where major changes were
required in the original text to correct errors or to accommodate limitations of
space, this fact is noted at the end of the reprinted version. Minor corrections 
are not noted. 

The articles in this volume have been arranged according to similarity of 
subject matter rather than according to chronology. The papers are divided 
into two major parts, the first dealing with deterministic decision models and 
the second dealing with stochastic decision models. Each part is preceded by a
commentary in which certain relationships among the papers in this volume 
(and elsewhere) are described. 

A few of the papers deal with topics that are not discussed by other papers in 
this collection. As a result I have not always found it convenient to mention 
such papers in the commentary. This silence should of course not be interpreted 
as reflecting any judgment on my part concerning the quality of these papers. 

I am indebted to C. West Churchman for his guidance in the initial planning
of these volumes; to R. M. Thrall, editor-in-chief of Management Science, for 
his constant support; to the editors of Management Science, especially William 
W. Cooper, Murray A. Geisler, and Morton Klein, for their comments and 
suggestions; to the department of industrial engineering of Stanford University 
for providing secretarial support; to Martin K. Starr for his excellent
cooperation; and to my wife for typing and preparing the final manuscript for
publication, as well as for her constant encouragement.


Arthur F. Veinott, Jr.



AN INTRODUCTION


This volume is a collection of reprints of some of the major research papers
published in the first eight volumes of Management Science, the journal of the
Institute of Management Sciences. Almost all of the papers represented are
concerned with the problem faced by a decision maker who desires to select from
among a collection of available alternatives one that is, in some sense, optimal.
These papers have two general features. First, there is specified a set of
variables and of empirical relations that link them. Usually some variables are
under the control of the decision maker, while others are not. A combination of
values of the variables satisfying the empirical relations is called a policy.
Second, there is an objective function that permits the policies to be ranked
according to their relative desirability. Together, the variables, empirical
relations, and objective function constitute a model.

One usually formulates a model of a real decision process with either or both
of the following objectives in mind: first, one desires an insight into the
qualitative properties of good decision rules for the situation under study;
second, one seeks an actual policy to use for some specified values of the
uncontrollable variables. Sometimes these objectives may be realized by a
mathematical analysis of the model without resorting to computation. However,
their attainment usually requires, and is always facilitated by, computational
procedures for finding an optimal policy, i.e., a policy that performs best as
measured by the objective function. All of the papers in this volume present
computational procedures for optimization and/or mathematical analyses directed
toward qualitative characterizations of models.

Abstract models, such as those formulated in this volume, attempt to mirror the
key features of certain concrete situations. The models necessarily suppress
many hopefully less significant details. Because a model is, at best,
approximately equivalent in a formal sense to a concrete situation, it is to be
expected that a policy that is optimal for a model will not be optimal for the
original concrete situation under study. Experience has shown, however, that it
is often possible to construct a model whose optimal policies perform quite well
in the corresponding concrete situation. It is this fact that makes model
construction useful.

From a mathematical point of view, optimization of a model usually involves
extremizing (maximizing or minimizing) a real valued function over a specified
set. A particularly important example is the linear programming problem. It
involves extremizing a linear function of finitely many real variables over a
set defined by a finite number of linear restrictions.

An important feature of real decision problems is the presence of uncertainties
about the future. Models that explicitly allow for uncertainties are based on
probability theory. The most sophisticated of these models allow explicitly for
the fact that decisions made at one point in time must be based on less
information than are decisions made at later points in time.

This volume is separated into two parts. In Part One papers describing
deterministic decision models are collected. Part Two includes papers
discussing stochastic decision models.



PART ONE 

DETERMINISTIC DECISION MODELS 


I 

COMMENTARY ON PART ONE: 
DETERMINISTIC DECISION MODELS 


It is often convenient to develop models of decision problems in which
uncertainties are suppressed. The models in Part One have this feature.

The models discussed in Sections II-V have the following mathematical
structure. One seeks an n coordinate vector of real numbers (or, briefly, an
n-vector) $x = (x_i)$ that

(1)    minimizes $c(x)$

subject to

(2)    $\sum_{i=1}^{n} a^i x_i = b$

(3)    $x_i \ge 0 \quad (i = 1, 2, \ldots, n)$

where $c$ is a given real valued function and $a^1, \ldots, a^n, b$ are given m-vectors.
In Sections II and III and parts of Section V it is further assumed that (1) takes
the special form

(1)'   minimizes $cx$

where $c = (c_i)$ is a given n-vector.¹ In this case we have a linear programming
problem. In Section IV, (1) takes the form

(1)''  minimizes $\sum_{i=1}^{n} \sum_{j=1}^{n} c_{ij} x_i x_j + \sum_{i=1}^{n} c_i x_i$

in which case the problem is one of quadratic programming. In this event $c(x)$
is taken to be convex in $x$.


Linear Programming 

A surprisingly large number of real decision problems can be studied
fruitfully by means of linear programming. The collection of papers given in
Sections II and III and parts of Sections V and VIII attests to this fact. Textbooks
on linear programming and related topics are also available [5, 6, 14, 15, 18, 20].

Duality and Existence Theorems in Linear Programming 

The main theorems of linear programming exploit the fact that linear
programming problems come in pairs. By using the constants of the “primal”
problem (1)', (2), (3) we may construct a “dual” problem, which is to find an
m-vector $y$ that

(4)    maximizes $yb$

subject to

(5)    $y a^i \le c_i \quad (i = 1, 2, \ldots, n)$.

It is important to note that the vectors $a^1, \ldots, a^n, b, c$ are the same in both
the primal and dual problem.

¹ If $u = (u_i)$ and $v = (v_i)$ are n-vectors, the scalar product of the vectors is
$uv = \sum_{i=1}^{n} u_i v_i$.

The vectors $x^*$ and $y^*$ are called optimal solutions to the primal and dual
problems respectively if $x^*$ satisfies (1)', (2), (3) and $y^*$ satisfies (4), (5). The
vectors $x$ and $y$ are said to be feasible for the primal and dual problems
respectively if $x$ satisfies (2), (3) and $y$ satisfies (5). The following are the main
theorems of linear programming, from Gale, Kuhn, and Tucker [21c].

Existence Theorem 

There exist optimal solutions to the primal and dual problems if and only if 
there exist feasible solutions to the problems. 

Dual Theorem 

If $x^*$ and $y^*$ are feasible solutions to the primal and dual problems
respectively, then $x^*$ and $y^*$ are optimal if and only if

(6)    $cx^* = y^* b$.

The proof of the sufficiency of (6) as a condition for the optimality of $x^*$ and
$y^*$ is so simple that we reproduce it here. Let $x$ and $y$ be feasible for the primal
and dual problems. Then by (2), (3), and (5)

(7)    $cx \ge \sum_{i=1}^{n} (y a^i) x_i = y \left( \sum_{i=1}^{n} a^i x_i \right) = yb$.

Setting $y = y^*$ in (7) and using (6) it follows that for any feasible $x$ for the
primal problem

$cx \ge y^* b = cx^*$,

which establishes the optimality of $x^*$. A similar argument shows $y^*$ to be
optimal.
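
The inequality (7) and the optimality condition (6) can be checked numerically on a small instance. The following Python sketch is illustrative only: the data (`cols`, `b`, `c`), the trial vectors, and the helper names are assumptions invented for this example and are not taken from any paper in this volume.

```python
from fractions import Fraction as F

# Hypothetical data for the primal (1)', (2), (3): minimize cx subject to
# sum_i a^i x_i = b, x >= 0.  The columns a^1..a^4 are 2-vectors (m = 2, n = 4).
cols = [(1, 3), (2, 1), (1, 0), (0, 1)]
b = (4, 6)
c = (-1, -1, 0, 0)

def dot(u, v):
    """Scalar product of two vectors of equal length."""
    return sum(ui * vi for ui, vi in zip(u, v))

def primal_feasible(x):
    """True if x satisfies (2) and (3)."""
    lhs = tuple(sum(a[k] * xi for a, xi in zip(cols, x)) for k in range(len(b)))
    return lhs == b and all(xi >= 0 for xi in x)

def dual_feasible(y):
    """True if y satisfies (5): y a^i <= c_i for every column a^i."""
    return all(dot(y, a) <= ci for a, ci in zip(cols, c))

# A feasible but non-optimal pair: (7) gives cx >= yb with strict inequality.
x, y = (2, 0, 2, 0), (-1, -1)
print(dot(c, x), dot(y, b))

# A feasible pair satisfying (6), hence optimal by the argument above.
x_star = (F(8, 5), F(6, 5), 0, 0)
y_star = (F(-2, 5), F(-1, 5))
print(dot(c, x_star), dot(y_star, b))
```

Exact rational arithmetic is used so that the equality in (6) can be tested without rounding error.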

The dual theorem of linear programming is of great importance. The
principal computational methods for solving linear and discrete dynamic
programming problems are based upon it. The minimax theorem for two person zero
sum matrix games is a simple corollary of it.

The paper by Charnes and Cooper, Chapter 7, deals with certain problems
in which there are multiple linear objective functions and relies heavily upon
the dual theorem. The papers by Dantzig, Chapter 6, and by Ford and
Fulkerson, Chapter 4, give examples of situations in which the dual of the “natural”
problem has a special structure which can be exploited in computations.





The Simplex Method 

The most important computational procedure for solving linear programming
problems is Dantzig’s famous simplex method [21a]. Actually this method
simultaneously solves both the primal and dual problems. We outline briefly
the essentials of a typical iteration of this finite iterative procedure as applied
to the problem of finding a pair of vectors $(x, y)$ satisfying (1)', (2)-(5).

Let $A = (a^1, \ldots, a^n)$. An iteration begins with the following information
at hand when $A$ has rank $m$:

(i) the inverse $B^{-1}$ of a basis matrix $B$ whose columns are $m$ linearly
independent columns of $A$ (by relabeling we may assume that
$B = (a^1, \ldots, a^m)$);

(ii) a vector $x = (x^*, 0)$ that is feasible for the primal problem and is such
that $x^* = B^{-1} b$ ($x$ is called a basic feasible solution);

(iii) a vector $y = c^* B^{-1}$ where $c^* = (c_1, \ldots, c_m)$ (the elements of $y$ are
variously called simplex multipliers or prices).

Observe from (ii) and (iii) that

$cx = (yA)x = y(Ax) = yb$,

so that by the sufficiency of (6) as a condition for optimality, $x$ and $y$ are optimal
if $y$ is feasible for the dual problem. If $y$ is not feasible for the dual problem, one
then finds a new basis matrix $\bar{B}$, inverse $\bar{B}^{-1}$, basic feasible solution $\bar{x}$, and
vector $\bar{y}$ of simplex multipliers. This is done as follows.

Select an integer $s$ for which

(8)    $y a^s > c_s$.

Then compute $t = B^{-1} a^s$ and let

(9)    $\bar{x} = x + \theta [u^s - (t, 0)]$

where $u^s$ has +1 in the $s$th position and zeroes elsewhere, and where $\theta$ ($\ge 0$) is
the largest number² for which $\bar{x} \ge 0$. Observe from (ii) and (9) that

$A\bar{x} = Ax + \theta [a^s - a^s] = b$,

so $\bar{x}$ is feasible for the primal problem. Using (8) one finds that

(10)   $c\bar{x} = cx + \theta [c_s - c^* B^{-1} a^s] = cx + \theta [c_s - y a^s] \le cx$

so that the new feasible solution $\bar{x}$ is an improvement over $x$.

In the usual non-degenerate case we may assume that $\theta > 0$ and that there
is a unique integer $r$, $1 \le r \le m$, for which $\bar{x}_r = 0$. We call this the
non-degeneracy hypothesis. Thus, in order for $\bar{x}$ to be a basic feasible solution for
$\bar{B}$, $\bar{B}$ must be formed by replacing the column vector $a^r$ in $B$ by $a^s$.

² If $\bar{x} \ge 0$ for all $\theta \ge 0$, then the objective function (1)' of the primal problem can be
made arbitrarily small (that is, large and negative) and there is no feasible
solution for the dual problem.

One way to form $\bar{B}$ from $B$ is to postmultiply $B$ by $E$ where $E$ is the
elementary matrix obtained by replacing the $r$th column of the $m \times m$ identity
matrix by $t$. Then $\bar{B} = BE$ and $\bar{B}^{-1} = E^{-1} B^{-1}$. Thus, if we can find $E^{-1}$ we may
compute $\bar{B}^{-1}$ directly from $B^{-1}$.

Now interchange the $r$th and $s$th columns of $A$, and the $r$th and $s$th
coordinates of $c$ and of $\bar{x}$ defined in (9). We shall then have $\bar{x} = (\bar{x}^*, 0)$ with
$\bar{x}^* = \bar{B}^{-1} b$ and $\bar{y} = \bar{c}^* \bar{B}^{-1}$ where $\bar{c}^*$ is formed from $c^*$ by replacing $c_r$ by $c_s$.

This completes an iteration except for constructing $E^{-1}$, which we now do.

It follows from the non-degeneracy hypothesis that the $r$th element of $t$,
say $t_r$, is positive. Now replace $t_r$ in $t$ by $-1$ and multiply the resulting vector
by $-1/t_r$. Then substitute the vector so obtained for the $r$th column of the
$m \times m$ identity matrix. The result is $E^{-1}$, which one checks by verifying that
$E E^{-1} = I$.

If there is a sequence of iterations that starts and ends with a common basic
feasible solution, then by (10) the objective function remains constant
throughout that sequence. But under the non-degeneracy hypothesis, (10) is a strict
inequality so that no basic feasible solution can recur. Because there are a
finite number of basic feasible solutions (no more than $n!/[(n - m)!\, m!]$, the
number of combinations of $n$ vectors taken $m$ at a time), the simplex method
therefore must terminate after a finite number of iterations. Thus, if the dual
problem is feasible, the simplex method terminates with vectors $x^*$ and $y^*$ that
are feasible for the primal and dual problems and that satisfy (6). Hence (6) is
a necessary (as well as sufficient) condition for optimality. Thus, the simplex
method provides a constructive proof of the dual theorem.
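
The iteration outlined above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production routine: the function name `simplex`, the example data, and the requirement that the starting basis columns form an identity matrix are all choices made for this sketch, and the non-degeneracy hypothesis is taken for granted (ties in the ratio test are not handled specially). Instead of interchanging columns of $A$, the sketch simply records which column occupies each basis position.

```python
from fractions import Fraction as F

def simplex(A, b, c, basis):
    """Minimize cx subject to Ax = b, x >= 0.

    `basis` lists the 0-based column indices of a starting basis; this sketch
    assumes those columns form the identity matrix and that b >= 0.
    Returns the final basic feasible solution x and simplex multipliers y.
    """
    m, n = len(A), len(c)
    A = [[F(v) for v in row] for row in A]
    b = [F(v) for v in b]
    c = [F(v) for v in c]
    basis = list(basis)
    Binv = [[F(int(i == k)) for k in range(m)] for i in range(m)]
    while True:
        # (ii) basic part of x is B^{-1} b; (iii) y = c* B^{-1}
        x_b = [sum(Binv[i][k] * b[k] for k in range(m)) for i in range(m)]
        y = [sum(c[basis[i]] * Binv[i][k] for i in range(m)) for k in range(m)]
        # (8): look for a column s with y a^s > c_s
        s = next((j for j in range(n) if j not in basis
                  and sum(y[i] * A[i][j] for i in range(m)) > c[j]), None)
        if s is None:                      # y is dual feasible: stop
            x = [F(0)] * n
            for i, j in enumerate(basis):
                x[j] = x_b[i]
            return x, y
        t = [sum(Binv[i][k] * A[k][s] for k in range(m)) for i in range(m)]
        # theta is the largest step keeping the new solution non-negative
        ratios = [(x_b[i] / t[i], i) for i in range(m) if t[i] > 0]
        if not ratios:
            raise ValueError("primal objective is unbounded below")
        _, r = min(ratios)
        # r-th column of E^{-1}: replace t_r by -1, then multiply by -1/t_r
        eta = [-t[i] / t[r] for i in range(m)]
        eta[r] = 1 / t[r]
        row_r = Binv[r]
        # new inverse = E^{-1} B^{-1}
        Binv = [[eta[i] * row_r[k] + (Binv[i][k] if i != r else 0)
                 for k in range(m)] for i in range(m)]
        basis[r] = s

# Hypothetical example: columns 2 and 3 are slack columns forming the identity.
A = [[1, 2, 1, 0],
     [3, 1, 0, 1]]
b = [4, 6]
c = [-1, -1, 0, 0]
x_opt, y_opt = simplex(A, b, c, basis=[2, 3])
print(x_opt, y_opt)
```

On this example the routine performs two basis changes and stops with $cx = yb$, in agreement with (6).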

Large Scale Linear Programming Problems 

The simplex method has turned out to be an efficient procedure for solving
linear programming problems on digital computers when there are no more
than a few hundred constraints. However, in many practical problems the
constraints number in the thousands. Some techniques for dealing with such
problems are discussed by Dantzig in Chapter 6 and in [16a].

The most exciting idea for dealing with large scale problems is the
decomposition principle of Dantzig and Wolfe [8]. Their work was stimulated in
part by the important paper of Ford and Fulkerson, Chapter 3.

Transportation and Network Problems 

An important special structure in linear programming arises when each column
of the matrix $A$ is composed of one +1, one -1, and zeroes elsewhere.
The problem with this structure is called a transhipment problem in Chapter 1.
The special case of the transhipment problem in which no row of $A$ contains both
a +1 and a -1 is called a transportation problem. Both names derive from the
fact that certain problems of choosing optimal routes for shipping the stocks of
a product from one set of locations to another can be formulated as linear
programming problems with the above structure.
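
The column structure just described can be made concrete with a short Python sketch. The function names, the 0-based node labels, and the sign convention (+1 at the node an arc leaves, -1 at the node it enters) are assumptions made for illustration; the text above requires only that each column contain one +1 and one -1.

```python
def incidence_matrix(num_nodes, arcs):
    """Build A with one row per node and one column per arc (i, j).

    Assumed convention: +1 in row i (origin), -1 in row j (destination).
    """
    A = [[0] * len(arcs) for _ in range(num_nodes)]
    for col, (i, j) in enumerate(arcs):
        A[i][col] = 1
        A[j][col] = -1
    return A

def is_transhipment(A):
    """Every column consists of exactly one +1, one -1, and zeroes."""
    return all(sorted(v for v in col if v != 0) == [-1, 1] for col in zip(*A))

def is_transportation(A):
    """Transhipment structure in which no row contains both a +1 and a -1."""
    return is_transhipment(A) and not any(1 in row and -1 in row for row in A)

# Hypothetical network: nodes 0 and 1 ship directly to nodes 2 and 3.
direct = [(0, 2), (0, 3), (1, 2), (1, 3)]
# Adding an arc from node 2 to node 3 makes node 2 an intermediate point.
with_relay = direct + [(2, 3)]

print(is_transportation(incidence_matrix(4, direct)))
print(is_transhipment(incidence_matrix(4, with_relay)),
      is_transportation(incidence_matrix(4, with_relay)))
```

In the second network node 2 both receives and forwards product, so its row contains both signs and the problem is a transhipment problem that is not a transportation problem.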

The importance of the transhipment problem derives from two facts. First,
special computational procedures that exist for solving the problem are
significantly more efficient than direct application of the simplex method. Indeed,
transhipment problems with several thousand constraints are within the range 
of current computing equipment. Second, many problems that on the surface 
seem entirely unrelated to the distribution problems indicated above have the 
transhipment structure in the “natural” formulation of the problem. This is 
the case with the caterer problem (Chapters 18, 19, 20) and the dual of the 
project cost curve problem (Chapter 4), as well as with a large number of other 
problems [12, 13, 25]. 

Orden shows in Chapter 1 that any transhipment problem can be reduced to 
an equivalent transportation problem. This reduction is of interest because it 
extends the applicability of the many algorithms that have been devised for 
solving transportation problems to transhipment problems [12, 21b]. 

An important property of transhipment problems is that whenever every 
coordinate of b is an integer, the same is true of every basic feasible solution of 
the primal problem. This property permits many combinatorial problems to 
be solved by formulating them as transhipment problems. 

It is often of interest to solve transhipment problems in which there are 
additional constraints of a simple form—e.g., upper bounds on certain variables 
or sums of variables. Wagner shows in Chapter 2 that a transhipment problem 
with upper bounds on certain partial sums of variables can be reduced to an 
ordinary transportation problem. In Chapter 5 Fulkerson investigates the 
problem of determining an optimal way of increasing the capacity of a network 
subject to a budget constraint. He solves the problem by solving a sequence of 
transhipment problems. 


Deterministic Dynamic Programming 

One special type of transhipment problem may be formulated as follows.
We are given $n$ nodes labeled $1, 2, \ldots, n$ and a collection of ordered pairs $(i, j)$
of nodes called arcs. There is associated with each arc $(i, j)$ in the network a
number $c_{ij}$ representing the cost of shipping one unit of product from node $i$ to
node $j$ along the arc $(i, j)$. The problem is to ship one unit of product from node
1 to node $n$ as cheaply as possible.

To formulate the problem we let $x_{ij}$ be the total amount of product shipped
from node $i$ to node $j$ along the arc $(i, j)$. We then seek $x_{ij}$ that

(11)   minimize $\sum_{i,j} c_{ij} x_{ij}$

subject to

(12)   $\sum_{j} x_{1j} = 1$

       $\sum_{j} x_{ij} - \sum_{j} x_{ji} = 0 \quad (i = 2, 3, \ldots, n - 1)$

       $\sum_{i} x_{in} = 1$

(13)   $x_{ij} \ge 0 \quad (i, j = 1, 2, \ldots, n)$.

The dual of this problem is to choose numbers $f_1, \ldots, f_n$ that

(14)   maximize $f_1 - f_n$

subject to

(15)   $f_i - f_j \le c_{ij} \quad (i, j = 1, 2, \ldots, n)$.

Because the last equation in (12) may be obtained by summing the first
$n - 1$ equations, the last equation is redundant and may be omitted. This
omission is equivalent to deleting $f_n$ from the dual problem, or, as we shall
assume, setting $f_n = 0$.

As we have suggested earlier, every basic feasible solution to (12), (13) will
involve integer $x_{ij}$’s. In the present case each such $x_{ij}$ is either 0 or 1.
Furthermore, a subcollection of the arcs $(i, j)$ for which $x_{ij} = 1$ will form a “path”
from node 1 to node $n$, i.e., the subcollection of arcs takes the form $(1, k_1)$,
$(k_1, k_2)$, $\ldots$, $(k_p, n)$. For an optimal basic feasible solution, the associated
path from node 1 to node $n$ will, of course, be a minimal cost path from node 1
to node $n$.

If the primal and dual problems are feasible, one optimal set of dual variables,
$f_i^*$ say, will satisfy the familiar recurrence relations of dynamic programming

(16)   $f_i^* = \min_j \{ c_{ij} + f_j^* \} \quad (i = 1, \ldots, n - 1)$.

We may interpret $f_i^*$ as the total cost of shipping one unit from node $i$ to node $n$
along the cheapest possible path.

The problem formulated above is actually a prototype of virtually all
discrete deterministic dynamic programming problems. In the dynamic
programming point of view one considers a process that may be in one of $n$ states. One is
allowed to move the process from state $i$ to $j$ at a cost $c_{ij}$. The problem is to
guide the process from state 1 through a sequence of intermediate states to
state $n$ as cheaply as possible. By identifying states with nodes and decisions
to go from $i$ to $j$ with arcs $(i, j)$, the equivalence of discrete dynamic
programming and the minimal cost path problem is immediate. Indeed we see also
that discrete deterministic dynamic programming is simply a branch of linear
programming in which duality plays a central role.

In many cases a discrete deterministic dynamic programming formulation
leads to an acyclic network, i.e., a network in which an arc $(i, j)$ is not admitted
unless $i < j$. For these cases one may calculate an optimal set of dual variables
from (16), starting with $f_n = 0$ and computing $f_{n-1}, f_{n-2}, \ldots, f_1$
in that order. The validity of this calculation may be established by
reference to the interpretations given above or, formally, by the dual
theorem. Moreover, an optimal solution to the primal problem is then immediately
at hand.
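
For an acyclic network the recurrence (16) can be evaluated by a single backward pass. The Python sketch below is illustrative only: the function name, the arc costs, and the assumptions that every arc $(i, j)$ has $i < j$ and that every node other than $n$ has at least one outgoing arc are choices made for this example.

```python
def min_cost_path(n, cost):
    """Evaluate (16) backward from node n on an acyclic network.

    `cost` maps arcs (i, j), with i < j, to c_ij.  Returns the dual variables
    f (with f[n] = 0) and a minimal cost path from node 1 to node n.
    """
    f = {n: 0}
    succ = {}
    for i in range(n - 1, 0, -1):
        # f_i = min_j { c_ij + f_j }; every f_j with j > i is already known
        f[i], succ[i] = min((c_ij + f[j], j)
                            for (k, j), c_ij in cost.items() if k == i)
    path = [1]
    while path[-1] != n:
        path.append(succ[path[-1]])
    return f, path

# Hypothetical four-node network.
cost = {(1, 2): 2, (1, 3): 5, (2, 3): 1, (2, 4): 6, (3, 4): 2}
f, path = min_cost_path(4, cost)
print(f, path)

# The computed f is feasible for the dual constraints (15), and f_1 - f_n
# equals the cost of the path found, so both are optimal by (6).
dual_feasible = all(f[i] - f[j] <= c_ij for (i, j), c_ij in cost.items())
path_cost = sum(cost[arc] for arc in zip(path, path[1:]))
print(dual_feasible, path_cost)
```

The final check mirrors the duality argument: a dual feasible $f$ and a primal feasible path with equal objective values certify each other's optimality.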

Charnes and Cooper [4] first employed the foregoing method to solve the
[...]. In Chapter 6, Dantzig extends the method to a broad class
of problems of which the problem above is a special case. Although
Charnes, Cooper, and Dantzig recognized the computational advantages of this
solution method, it remained for d’Epenoux [9] to develop the dynamic
programming interpretation offered here.

In much of the literature of dynamic programming a generalization of (16) 
is studied in which the number of possible states and decisions is infinite. This 
study amounts to considering a minimal cost path problem with infinitely many 
nodes and arcs. In order to actually solve such problems, however, one is 
forced to use a finite approximation or to exploit the structure of the problem to 
simplify the computations. Excellent examples of this latter approach in certain 
problems of production planning and inventory control are given by Bellman 
in Chapter 18, Dreyfus in Chapter 14, and Wagner and Whitin in Chapter 16. 

Quadratic Programming 

The quadratic programming problem arises as an approximation to the 
problem of minimizing a convex function subject to linear inequalities. The 
problem also arises directly in certain circumstances. This is the case, for 
example, where the coefficients of the objective function of the linear
programming problem (1)', (2), (3) are random variables and where x must be
chosen before the values of those random variables are known. In this event it is 
generally impossible to choose a single x that solves (1)', (2), (3) for all c. If 
instead one adopts the reasonable alternative of choosing an x that minimizes 
the expected value of cx , a linear programming problem is again obtained. 
If, however, one desires to have a small probability of obtaining very high 
values of the objective function while at the same time achieving a low expected 
value of the objective function, a different approach is called for. It is then 
reasonable to seek to minimize the variance $\sum_i \sum_j c_{ij} x_i x_j$ of $cx$ ($c_{ij}$ is the
covariance of $c_i$ and $c_j$) subject to a restriction that the expected value of $cx$ not
exceed a certain maximal level. The resulting problem is clearly one in quadratic 
programming. This application was initially suggested by Markowitz [23] in 
his analysis of efficient investment portfolios. 
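
The trade-off between expected value and variance described above can be illustrated with a few lines of Python. The mean vector, the covariance matrix, and the three trial vectors $x$ are hypothetical numbers chosen for this sketch; nothing here reproduces Markowitz’s procedure itself.

```python
def expected_cost(mean, x):
    """Expected value of cx when c has the given mean vector."""
    return sum(m_i * x_i for m_i, x_i in zip(mean, x))

def variance_of_cost(cov, x):
    """Variance of cx: sum_i sum_j c_ij x_i x_j, with c_ij = cov(c_i, c_j)."""
    n = len(x)
    return sum(cov[i][j] * x[i] * x[j] for i in range(n) for j in range(n))

# Hypothetical data: two activities with random unit costs.
mean = [10, 6]
cov = [[4, 1],
       [1, 9]]

for x in ([1, 0], [0.5, 0.5], [0, 1]):
    print(x, expected_cost(mean, x), variance_of_cost(cov, x))
```

In this made-up instance the mixed vector has a lower variance than either pure choice, at an expected cost between the two, which is the kind of compromise the quadratic programming formulation is designed to find.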

The first algorithm for solving the quadratic programming problem was 
given by Beale [2]. Since that time several other algorithms have been proposed. 
Many of these are similar to the simplex method in various respects. One 
proposal by Dantzig [7] is a direct generalization of the simplex method. By 
this statement is meant that if $c_{ij} = 0$ for all $i, j$ in (1)'', the resulting
application of Dantzig’s algorithm is equivalent to the usual simplex method.

A generalization of the dual theorem of linear programming has been
developed for quadratic programming as well as more general nonlinear problems.
Dorn [10] surveys these results and discusses several of the proposed algorithms 
for solving quadratic programming problems. 

Among the proposals for solving quadratic programming problems surveyed 
by Dorn are those presented by Theil and Van de Panne in Chapter 9 and 
Lemke in Chapter 11. These two algorithms have the common feature that 
both begin by finding the absolute minimum of the objective function without 
regard to the constraints. If this solution satisfies the constraints, it is optimal; 
if not, the algorithms proceed separately. Theil and Van de Panne give a
systematic procedure for adding constraints until an optimal solution is found.
An alternative justification for this procedure using the Kuhn-Tucker
conditions is given by Boot in Chapter 10. Lemke instead uses the fact that the
dual problem (also a quadratic programming problem) can be simplified after 
the absolute minimum of the primal objective function is found. He then 
develops an algorithm for solving the simplified dual problem in which the 
dual is feasible at each step. There are corresponding solutions of the primal
problem which are infeasible until the final step, at which an optimal solution
is found. These features are reminiscent of Lemke’s dual simplex method [22]
for solving linear programming problems. 


Production and Inventory Control 

The most fertile field for applications in the management sciences has been 
production and inventory control. There are now several books [1, 17, 19, 24] 
available which give accounts of recent research in the field. A variety of models 
have been proposed of which those in Section V are representative. Each paper
in Section V develops a specially designed algorithm for solving a problem
more efficiently than would be possible with an appropriate, general purpose
algorithm like the simplex method. We shall here attempt to point out
relationships between the papers in Section V by formulating a model that includes
as special cases several of the closely related models in that section. 

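Consider the problem of finding numbers $p_1, \ldots, p_n$, $s_1, \ldots, s_n$, $y_1, \ldots, y_n$ 
that minimize 

(17)    $\sum_{i=1}^{n} [c_i(p_i) + g_i(s_i) + h_i(y_i)]$ 

subject to ($s_0 = 0$) 

(18)    $y_i = y_{i-1} + p_i - s_{i-1}$    $(i = 1, \ldots, n)$ 

(19)    $\underline{y}_i \le y_i \le \bar{y}_i$    $(i = 1, \ldots, n)$ 

(20)    $0 \le p_i$    $(i = 1, \ldots, n)$ 

(21)    $\underline{s}_i \le s_i \le \bar{s}_i$, $s_i \le y_i$    $(i = 1, \ldots, n)$ 

where $g_i$, $c_i$, and $h_i$ are given continuous, real-valued functions and where 
$\underline{y}_1, \ldots, \underline{y}_n$, $\bar{y}_1, \ldots, \bar{y}_n$, $\underline{s}_1, \ldots, \underline{s}_n$, $\bar{s}_1, \ldots, \bar{s}_n$ are given constants. We may interpret 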
this problem as one of choosing the amounts of a single product to produce and 
sell during each of $n$ successive time periods $1, 2, \ldots, n$ so as to minimize the 
total manufacturing and selling costs over those periods while satisfying given 
capacity constraints. In this interpretation $p_i$ is the amount produced at the 
beginning of period $i$, $s_i$ is the amount sold at the end of period $i$, and $y_i$ is the 
inventory on hand after production but before sales in period $i$. The production 
cost function for period $i$ is $c_i$; $g_i$ is the sales cost function for period $i$,³ and $h_i$ 
is the inventory carrying cost function for period $i$. 
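
To make the roles of the variables concrete, the following is a minimal Python sketch that checks a candidate plan against constraints (18)-(21) and evaluates the objective (17). The function name `plan_cost`, the argument layout, the initial inventory `y0`, and the sample cost functions are illustrative assumptions and do not come from the text.

```python
import math

def plan_cost(p, s, y0, c, g, h, y_lo, y_hi, s_lo, s_hi):
    """Return the total cost (17) of a plan, or None if it violates (18)-(21).

    p[i], s[i] are production and sales in period i+1 (0-based lists);
    c, g, h are lists of per-period cost functions; y0 is an assumed
    initial inventory (the text fixes s_0 = 0).
    """
    y_prev, s_prev = y0, 0.0
    total = 0.0
    for i in range(len(p)):
        y = y_prev + p[i] - s_prev              # balance equation (18)
        if not (y_lo[i] <= y <= y_hi[i]):       # inventory bounds (19)
            return None
        if p[i] < 0:                            # nonnegative production (20)
            return None
        if not (s_lo[i] <= s[i] <= s_hi[i]) or s[i] > y:   # sales bounds (21)
            return None
        total += c[i](p[i]) + g[i](s[i]) + h[i](y)
        y_prev, s_prev = y, s[i]
    return total

# Hypothetical three-period example: linear production cost, revenue
# entered as a negative sales cost, linear carrying cost.
n = 3
cost = plan_cost(
    p=[4, 0, 3], s=[2, 2, 3], y0=0,
    c=[lambda q: 2 * q] * n,
    g=[lambda q: -5 * q] * n,
    h=[lambda q: 1 * q] * n,
    y_lo=[0] * n, y_hi=[math.inf] * n,
    s_lo=[0] * n, s_hi=[math.inf] * n,
)
print(cost)  # -12.0
```

A negative total here simply reflects that revenue is entered as a negative cost.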


³ Since sales revenue is a negative cost, $g_i$ is usually negative; $g_i$ accounts for 
the fact that the selling price of a product may vary with the amount sold. 

Table I gives various specializations of the above problem that are discussed 
in Chapters 12-16 and 33. 


TABLE I 

Chapter  $c_i(p_i)$           $g_i(s_i)$           $h_i(y_i)$           $\bar{y}_i$   $\underline{y}_i$   $\bar{s}_i$           $\underline{s}_i$ 
12, 33   convex               0                    linear               $\infty$      0                   $= \underline{s}_i$   $\ge 0$ 
13       linear, stationary   linear, stationary   convex                             $\ge 0$                                   0 
14       linear               linear               linear               $\ge 0$       0                   $\infty$              0 
15*      quadratic (convex)   quadratic (convex)   quadratic (convex)   $\infty$      $-\infty$           $\infty$              $-\infty$ 
16       concave†             0                    linear               $\infty$      0                   $= \underline{s}_i$   $\ge 0$ 

* In this model (19), (20), and (21) are omitted. 

† The concave cost function in Chapter 16 is assumed to take the special form 

        $c_i(p_i) = \begin{cases} 0, & p_i = 0 \\ K_i + v_i p_i, & p_i > 0 \end{cases}$ 

where $K_i \ge 0$. 

The assumption that $\underline{s}_i = \bar{s}_i$ $(i = 1, \ldots, n)$ in Johnson’s paper (Chapter 12) 
means that the sales to be realized in each period are known in advance. The 
problem is then to plan production to meet sales at minimal cost. Johnson 
devises an extremely simple algorithm for solving his problem. The procedure 
is simply to satisfy each unit of sales in order of occurrence as cheaply as 
possible. An alternative procedure for solving the same problem is given by Charnes, 
Cooper, and Symonds in Section 7 of Chapter 33. An extension of Johnson’s 
method is shown to be applicable by Wagner [27] where the planner is allowed 
to vary his volume of sales by adjusting his selling price. 
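
The rule "satisfy each unit of sales in order of occurrence as cheaply as possible" can be sketched as follows for integer units. This is only an illustration under stated assumptions: the name `johnson_schedule`, the representation of a convex production cost as a nondecreasing list of marginal unit costs, and the per-period carrying charge `hold` are all hypothetical choices, not taken from Johnson's paper.

```python
def johnson_schedule(demand, marginal_cost, hold):
    """Greedy rule: fill each unit of demand, in order of occurrence,
    from the cheapest available source period.

    demand[i]        -- integer units required in period i
    marginal_cost[j] -- nondecreasing list; entry k is the cost of the
                        (k+1)-th unit produced in period j (convex cost);
                        its length acts as the capacity of period j
    hold[k]          -- cost of carrying one unit from period k to k+1
    """
    n = len(demand)
    produced = [0] * n
    total = 0
    for i in range(n):
        for _ in range(demand[i]):
            best_j, best_cost = None, None
            carry = 0                       # carrying cost from period j up to i
            for j in range(i, -1, -1):      # candidate source periods i, i-1, ..., 0
                if j < i:
                    carry += hold[j]
                if produced[j] < len(marginal_cost[j]):    # capacity left in j
                    cost = marginal_cost[j][produced[j]] + carry
                    if best_cost is None or cost < best_cost:
                        best_j, best_cost = j, cost
            if best_j is None:
                raise ValueError("demand cannot be met")
            produced[best_j] += 1
            total += best_cost
    return produced, total

# Hypothetical two-period example.
schedule, cost = johnson_schedule(
    demand=[1, 3],
    marginal_cost=[[1, 1, 4], [3, 4, 6]],
    hold=[1],
)
print(schedule, cost)  # [2, 2] 10
```

In this small instance an exhaustive check of the three feasible production splits gives the same minimum of 10.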

In Johnson’s paper, inventories are held as a means of satisfying future 
requirements as cheaply as possible. By contrast, in Chapter 13 Karush and 
Vazsonyi consider situations in which inventories are held to provide service, 
e.g., inventories of labor and equipment. The problem is then to plan changes 
in inventory levels to meet fluctuating but known requirements for service in 
each period as cheaply as possible. Karush and Vazsonyi’s simple algorithm 
for solving their problem is based on an interesting property of the optimal 
inventory levels, viz., that the inventory levels equal the minimal required levels 
except during intervals over which the inventory level is held constant. Their 
procedure is essentially to search through schedules with the above property 
to find one that is optimal. Recent work on this problem has been described by 
Veinott and Wagner [25]. 

Dreyfus develops an interesting dynamic programming algorithm for solving 
the warehousing problem in Chapter 14. In this model selling and buying 
prices vary over time in a known way. The problem is to time buying and 
selling of stock so as to minimize total costs while assuring that the inventory 
levels never exceed a fixed storage limit. An alternative algorithm for solving 
this problem was proposed by Charnes and Cooper [4]. Their algorithm is 
outlined by Dantzig in Chapter 6. Further study of the problem has been given by 
Veinott and Wagner [25]. 

Veinott [26] has made a study encompassing all three models discussed above. 

The production planning problem studied by Holt, Modigliani, and Muth in 
Chapter 15 is more general than indicated in Table I. In particular, inventories 
of product and labor are dealt with separately and a cost of changing the size 
of the labor force is introduced. The interesting consequence of their assumption 
of quadratic cost functions and unconstrained variables is that the optimal 
inventory and labor force levels in one period are linear functions of those 
same quantities in the preceding period. An excellent book [19] gives a 
complete account of the use of quadratic cost functions in production planning. 

The model studied by Wagner and Whitin in Chapter 16 differs from those 
discussed above in that the cost function is concave. This concavity means 
that an optimal solution occurs at an extreme point of the set of feasible 
solutions. Wagner and Whitin show that there is an optimal production schedule 
with the property that if there is a positive inventory level at the beginning of 
a period, then no production takes place during that period. This property was 
first noted by Manne in the article reprinted as Chapter 17. Wagner and 
Whitin find an optimal policy with the aid of the dynamic programming 
recurrence relation (16) where $c_{ij}$ is the cost of ordering enough stock in period $i$ 
to satisfy the requirements in periods $i, \ldots, j - 1$ plus the cost of carrying 
inventory over that interval. The hypotheses of the model are further weakened 
in an article by Wagner [27]. 
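
Recurrence (16) is not reproduced in this section. The sketch below assumes it has the usual shortest-path form $f_j = \min_{i<j} (f_i + c_{ij})$ with $f_0 = 0$, and builds $c_{ij}$ from a fixed ordering cost plus linear carrying charges. The function name `wagner_whitin` and the cost data are hypothetical, and the unit purchase cost is taken as constant and therefore omitted.

```python
def wagner_whitin(demand, setup, hold):
    """Lot sizing by the recurrence f[j] = min over i < j of f[i] + c(i, j).

    demand[t] -- requirement in period t
    setup[t]  -- fixed cost incurred if an order is placed in period t
    hold[t]   -- cost of carrying one unit from period t to t+1
    c(i, j) is the cost of ordering in period i enough stock for periods
    i, ..., j-1, plus the cost of carrying it over that interval.
    """
    n = len(demand)
    f = [0.0] + [float("inf")] * n      # f[j]: least cost of covering periods 0..j-1
    pred = [0] * (n + 1)
    for j in range(1, n + 1):
        for i in range(j):
            cost = setup[i]
            carry = 0.0
            for t in range(i + 1, j):   # demand[t] is carried from period i to t
                carry += hold[t - 1]
                cost += carry * demand[t]
            if f[i] + cost < f[j]:
                f[j], pred[j] = f[i] + cost, i
    orders, j = [], n                   # walk back through the chosen order periods
    while j > 0:
        orders.append(pred[j])
        j = pred[j]
    return f[n], sorted(orders)

# Hypothetical four-period example.
print(wagner_whitin([20, 50, 10, 50], [100] * 4, [1] * 3))  # (270.0, [0, 3])
```

Each period in the returned list starts with zero inventory carried in, which is the property of an optimal schedule described above.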

Manne (Chapter 17) studies a generalization of the model proposed by 
Wagner and Whitin in which several products are permitted and labor is rationed 
among the products so as to minimize costs. Manne’s (approximate) 
formulation of the problem as one in linear programming is a good example of how a 
clever choice of variables can make a seemingly intractable problem solvable. 

Bellman (Chapter 18) and Prager (Chapter 19) give algorithms for solving 
the caterer problem, a variation of a special case of the problem studied by 
Karush and Vazsonyi in Chapter 13. In the caterer problem, one assumes that 
gi and hi are identically zero. One also assumes that after a unit of product 
provides service for one period, it must undergo repair during several periods 
before it can be used again. This contrasts with the assumption of Chapter 13 
that necessary repair time is negligible. There are two types of repair service, 
one fast and one slow, with the former being more expensive. 

As Beale suggests in Chapter 20, Prager’s algorithm in Chapter 19 exploits 
the following property of an optimal solution: Suppose we fix the total number 
of units produced over the n periods. Then the optimal way of providing 
service is to satisfy each unit of service requirements in order of occurrence 
according to the rule: First use product that has not previously been used; 
second, use product that can be repaired on slow service and be available for 
use; finally, use product that can be repaired only on fast service starting with 
the product that was used latest. This procedure can be interpreted as satisfying 
each unit of service in order of occurrence as cheaply as possible if the cost 
of production is taken to be zero.⁴ Notice that with this interpretation Prager’s 
procedure is similar to that used by Johnson in Chapter 12 on a related problem 
discussed above. 

Derman and Klein (Chapter 21) and Lieberman (Chapter 22) consider inventory 
problems in which there is initially a stockpile of items of different 
ages that are to be used as efficiently as possible. In one problem the useful 
service life of an item is a function of its age at issue. One then seeks to select 
the order of issuing the items so as to maximize the total service life from 
all items when used sequentially. In a second problem the future demands for 
the product are known and the return received from issuing an item to satisfy 
a demand is a function of the age of the item at the time the demand occurs. 
This time, one seeks to select the order of issuing items so as to maximize the 
total return received from all items. For both problems conditions are given 
which ensure the optimality of first-in-first-out (FIFO) and last-in-first-out 
(LIFO) issuing policies. Subsequent investigations of these problems are given 
in a number of papers [3, 11, 28]. 
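
The first issuing problem can be explored numerically on small stockpiles. The sketch below assumes the setting described above: items are issued one at a time, each when its predecessor expires, and every item keeps aging while it waits. The names `total_service_life` and `best_issue_order`, the linear life function, and the ages are hypothetical; the exhaustive search is only a check on small cases, not the method of the cited chapters.

```python
from itertools import permutations

def total_service_life(ages, order, life):
    """Total life obtained when items are issued sequentially in `order`.

    ages[k] is the age of item k at time zero; life(a) is the service
    life of an item whose age at issue is a.
    """
    elapsed = 0.0
    for k in order:
        elapsed += life(ages[k] + elapsed)   # age at issue = initial age + waiting time
    return elapsed

def best_issue_order(ages, life):
    """Exhaustive search over all issuing orders (small stockpiles only)."""
    return max(permutations(range(len(ages))),
               key=lambda order: total_service_life(ages, order, life))

# Hypothetical data: life falls off linearly with age at issue.
ages = [1.0, 2.0, 3.0]
life = lambda a: max(0.0, 10.0 - 0.5 * a)

fifo = (2, 1, 0)   # oldest item issued first
lifo = (0, 1, 2)   # youngest item issued first
print(total_service_life(ages, fifo, life))   # 16.125
print(total_service_life(ages, lifo, life))   # 15.375
print(best_issue_order(ages, life))           # (2, 1, 0)
```

For this particular life function the search returns the FIFO order; other life functions can favor LIFO.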


THE TRANSHIPMENT PROBLEM 

ALEX ORDEN 

Burroughs Corporation 
1. Introduction 

The “transportation problem” in linear programming refers to a class of 
linear programming problems whose first example was that of selection of most 
economical shipping routes for transfer of a single commodity from a number of 
sources to a number of destinations. A number of linear programming models 
have since been developed involving manpower assignment, machine loading, 
and others, in which the algebraic equations are identical in form to those of the 
original transportation problem. Currently it appears that this special class 
covers the majority of the applications of linear programming which are in 
practical use or under active development.¹ We will refer to all linear 
programming problems which fall into this mathematical mold as “transportation 
problems”. These problems take the form 


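        $\sum_{j=1}^{n} x_{ij} = a_i$        $i = 1, \ldots, m$ 

        $\sum_{i=1}^{m} x_{ij} = b_j$        $j = 1, \ldots, n$ 

        $x_{ij} \ge 0$ 

        $\sum_{i} \sum_{j} c_{ij} x_{ij}$ = Min. or Max. 

where $a_i$, $b_j$, and $c_{ij}$ are given parameters. 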

In the course of work which has led to transportation problem type algebraic 
formulations for various applications it has become clear that this class of linear 
programming problems has important distinguishing characteristics. Perhaps 
the most valuable from a practical point of view is that methods of solution have 
been developed which make it possible to handle problems with a large number 
of unknowns, $x_{ij}$. (Problems with $m$ and $n$ up to about 200, i.e. involving 40,000 
$x_{ij}$’s, can be handled on high speed computing machines.) An equally significant 
characteristic is that the transportation problem offers an approach to some 
problems which appear at first to be purely combinatorial. Specifically, the 
so-called “assignment problem”, e.g., optimal assignment of men to jobs, can be 
formulated as a transportation problem.² 

¹ L. W. Smith, “Current Status of the Industrial Use of Linear Programming”, 
Management Science, Vol. II, No. 2. 

² D. F. Votaw and A. Orden, “The Personnel Assignment Problem”, Symposium on 
Linear Inequalities and Programming, Hq. U. S. Air Force, (1952) pp. 155-163. 

The connection of the transportation problem to combinatorial problems is a 
strong one. In the “assignment problem” one asks for the most efficient 
assignment combination. When this combinatorial problem is formulated in the algebra 
of the transportation problem, it appears at first that the solution will permit 
fractional assignments, i.e. might divide each man’s time among several jobs. 
It turns out, however, that the linear programming solution always leads to a 
one man to one job solution, just as though only combinations of this type were 
eligible. 

Thus the transportation problem has offered two mathematical facets: 

(1) as a specialized type of linear programming problem, 

(2) as a method of representation of some combinatorial problems. 

The distinction between these two aspects of the problem may not appear 
important in applications. Both aspects deal with selection problems which call 
for an optimum pair-wise relation of one group of items to another group, as in 
relating shipping points to destinations in the original transportation problem. 
The main difference between the linear programming aspect and the 
combinatorial aspect is that the latter calls specifically for taking advantage of the 
fact that integer values of the quantities $a_i$ and $b_j$ always provide at least one 
optimum solution with integer values of the $x_{ij}$. 

In this paper a third aspect of the mathematical properties of the 
transportation problem is developed. It is shown that the same mathematical framework 
can be extended beyond pair-wise connections, to the determination of optimum 
linked paths over a series of points. This extension, although viewed here as a 
linear programming problem, takes advantage of the combinatorial aspect of the 
transportation problem, and applications may arise which, like the assignment 
problem, appear to be combinatorial problems, but which can be solved by 
linear programming. 

The main part of the treatment which follows is in terms of extension of the 
original transportation problem to include the possibility of transhipment; i.e. 
any shipping or receiving point is also permitted to act as an intermediate point 
in seeking an optimum solution. A supplementary example (Section 6) isolates 
this extension of the transportation problem more specifically as a problem of 
determination of an optimum linked path. The transhipment technique is used 
to find the shortest route from one point in a network to another. 

2. The Transhipment Procedure 

The original transportation problem deals with selection of shipping routes 
so as to minimize the cost of shipping a uniform commodity from specified 
origins to specified destinations. The amounts to be sent from each origin, the 
amounts to be received by each destination, and the cost per unit shipped from 
any origin to any destination are specified. Transhipment is not considered, that 
is, each point acts as a shipper only or as a receiver only. The problem without 
transhipment will be denoted in this paper by “$T_0$”. In extending the problem 
to permit transhipment the situation is the same as in “$T_0$” with the additional 
feature that shipments may go via any sequence of points rather than being 
restricted to direct connections from one of the origins to one of the destinations. 
We will denote this problem by “$T_1$”. The transhipment problem will be solved 
by converting it in a specified way to a computation problem which has the form 
of $T_0$. After conversion to a $T_0$ problem, Dantzig’s simplex technique³ provides 
a satisfactory computation technique. That technique and variants of it have 
been coded for solution on several electronic digital computers. 

The general nature of the solution is as follows: In $T_0$ there are distinct 
shippers and receivers. The transhipment problem, $T_1$, is to be converted to the 
form of $T_0$ by treating each point as a pair of points, one acting as a shipper 
and one as a receiver. The unit cost of shipment from a point considered as a 
shipper to the same point considered as a receiver is set equal to zero. It is 
assumed (for computation purposes only) that a large amount of the material 
to be shipped is available at each point and acts as a stockpile which can be 
drawn or replenished. The solution to the transhipment problem lies in the 
fact that withdrawals from and compensating additions to the stockpiles are 
equivalent to transhipment. The stockpile sizes do not matter provided they 
are large enough to permit all desirable shipments which can reduce the cost 
(see Sec. 4). In the computation excessively large stockpiles are arbitrarily 
introduced. The excesses of stockpiles over amounts actually shipped drop out 
of the final solution (they appear as shipments from a point to itself at zero 
cost). The procedure is as follows: 

Step 1 . Problem formulation 

Assume M points which are either shippers or receivers. 

Let $g_i = g_1, g_2, \ldots, g_M$ be specified net amounts to be shipped by each point, 
where some $g_i$ are positive and some negative, and 
$\sum (g_i \mid g_i > 0) = -\sum (g_i \mid g_i < 0)$, 
i.e. the total to be shipped is equal to the total to be received. 

Let $a_i$ = amount shipped by each point including transhipment and $b_i$ = 
amount received by each point including transhipment; $a_i$ and $b_i$ are not yet 
specified but must satisfy: 

(1)    $g_i = a_i - b_i$,    $i = 1, 2, \ldots, M$ 

Let $c_{ij}$ for $i \ne j$ be specified unit costs of shipment from point $i$ to point $j$. 
All these $c_{ij}$ are assumed $> 0$. 

Let $c_{ii} = 0$ be the unit cost of shipping from a point to itself. 

Step 2. Set up the transhipment problem in the form of a $T_0$ problem (without 
transhipment) in which there are $M$ origins and $M$ destinations. The amounts 
shipped, $a_i$, and the amounts received, $b_i$, must be $\ge 0$. On the basis of the 
specified net amounts, $g_i$, the smallest values which can be used as $a_i$ and $b_i$ are 
$a_i^0$ and $b_i^0$ as defined by (2). 

(2)    If $g_i \ge 0$, set $a_i^0 = g_i$, $b_i^0 = 0$ 
       If $g_i < 0$, set $a_i^0 = 0$, $b_i^0 = |g_i|$ 

These $a_i^0$ and $b_i^0$ satisfy (1). 

³ G. B. Dantzig, “Application of the Simplex Method to a Transportation Problem”, 
Chapter XXIII of Activity Analysis of Production and Allocation, John Wiley & Co., 1951. 

A. Charnes and W. W. Cooper, “The Stepping Stone Method of Explaining Linear 
Programming Calculations in Transportation Problems”, Management Science , Vol. I, 
No. 1. 
(3)    Let $a_i' = a_i^0 + s$ 
           $b_i' = b_i^0 + s$ 

where $s$ is a positive constant (a stockpile at each point). 

The $a_i'$ and $b_i'$ defined by (3) are to be used in the computation. $s$ is to be 
large enough to permit all possible transhipments, viz., by adding $s$ to $a_i^0$, the 
total shipped, $a_i'$, may include (a) the net amount, $g_i$, originally specified, (b) 
additional amounts received in the course of transhipment, and (c) redundant 
amounts which are shipped from the point to itself at zero cost. 

It will be shown in Section 4 that $s$ must be taken large enough that in the 
minimum cost solution all $x_{ii}$ differ from zero. A suitable value for $s$, since it 
obviously introduces artificial stockpiles which equal any possible 
transhipments, is: 

(4)    $s = \sum (g_i \mid g_i > 0) = -\sum (g_i \mid g_i < 0) = \frac{1}{2} \sum_{i=1}^{M} |g_i|$ 
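
Steps 1 and 2 amount to a small data transformation, which the following Python sketch carries out under equations (2), (3), and (4). The function name `enlarged_problem` and the placeholder unit costs are hypothetical; only the net amounts and the stockpile $s = 20$ are those of the example in Section 3.

```python
def enlarged_problem(g, cost, s=None):
    """Build the T0 data (a', b', c) of Step 2 from the net shipments g.

    g[i] > 0 means point i ships a net amount; g[i] < 0 means it receives.
    cost[i][j] is the unit cost from i to j for i != j; the diagonal is set to 0.
    If s is omitted, the stockpile given by equation (4) is used.
    """
    if sum(g) != 0:
        raise ValueError("total shipped must equal total received")
    if s is None:
        s = sum(x for x in g if x > 0)        # equation (4)
    a0 = [max(x, 0) for x in g]               # equation (2)
    b0 = [max(-x, 0) for x in g]
    a = [x + s for x in a0]                   # equation (3)
    b = [x + s for x in b0]
    m = len(g)
    c = [[0 if i == j else cost[i][j] for j in range(m)] for i in range(m)]
    return a, b, c, s

# Net amounts for points Q, R, S, T, U; the unit costs are placeholders.
g = [-2, -5, -4, 5, 6]
placeholder_cost = [[1] * 5 for _ in range(5)]
a, b, c, s = enlarged_problem(g, placeholder_cost, s=20)
print(a)  # [20, 20, 20, 25, 26]
print(b)  # [22, 25, 24, 20, 20]
```

Calling the function without `s` uses equation (4), which for these net amounts gives a stockpile of 11.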

Step 3. Compute the min. cost solution to the $T_0$ problem defined by (3) and 
the $c_{ij}$’s. Let $C'$ be the total cost and $x_{ij}'$ be the shipments of the min. cost 
solution (all $x_{ii}' \ne 0$). 

Step 4. The final $x_{ii}'$ are redundant. Discard them as follows: Each $x_{ii}'$ is 
contained in both the amount shipped, $a_i'$, and the amount received, $b_i'$, and 
may be deducted from both. Replace (3) by: 

(5)    $a_i'' = a_i' - x_{ii}'$ 
       $b_i'' = b_i' - x_{ii}'$ 

Then a feasible solution to the $T_0$ problem given by (5) is 

(6)    $x_{ij} = x_{ij}'$    for $i \ne j$ 
       $x_{ii} = 0$ 

where $x_{ij}'$ are the values obtained by computation in Step 3. 

Corresponding to any possible solution to the $T_0$ problem based on the $a_i''$ and 
$b_i''$ there is a feasible solution to (3) with the same cost, namely introduce 
the “diagonal” shipments $x_{ii}'$. Therefore, since $C'$ is the Min. cost for (3), it must 
also be the Min. cost for (5). 

Step 5. Convert the results in the form (5) and (6) to the solution to the 
transhipment problem. This conversion is straightforward since (5) gives the total 
amounts shipped and received at each point, and (6) gives the amounts shipped 
along each route. (See the example in the next section.) 

The Min. cost solution involves $(M - 1)$ point-to-point paths for which $x_{ij} \ne 0$. 
(In degenerate cases the number of paths may be smaller.) It will be shown 
later that the Min. cost solution to the transhipment problem requires no more 
paths than would be required if only direct shipment were permitted. 
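
Steps 4 and 5 can be sketched as a short Python function that strips the diagonal shipments according to (5) and (6). The function name `reduce_solution` is hypothetical. The matrix `x` below is an assumed Step 3 result for five points Q, R, S, T, U with $s = 20$: it contains the routes described in the next section (U to S, S to Q) together with an inferred shipment from T to R, since the printed solution table is not reproduced here.

```python
def reduce_solution(a, b, x):
    """Apply equations (5) and (6): remove the diagonal shipments x[i][i]."""
    m = len(a)
    a2 = [a[i] - x[i][i] for i in range(m)]      # a'' of (5)
    b2 = [b[i] - x[i][i] for i in range(m)]      # b'' of (5)
    routes = {(i, j): x[i][j]                    # (6), keeping nonzero entries only
              for i in range(m) for j in range(m)
              if i != j and x[i][j] != 0}
    return a2, b2, routes

# Points are indexed Q=0, R=1, S=2, T=3, U=4.
a = [20, 20, 20, 25, 26]
b = [22, 25, 24, 20, 20]
x = [
    [20,  0,  0,  0,  0],   # Q keeps its whole stockpile
    [ 0, 20,  0,  0,  0],   # R keeps its whole stockpile
    [ 2,  0, 18,  0,  0],   # S passes 2 units on to Q
    [ 0,  5,  0, 20,  0],   # T ships 5 units to R (inferred)
    [ 0,  0,  6,  0, 20],   # U ships 6 units to S
]
a2, b2, routes = reduce_solution(a, b, x)
print(a2)      # [0, 0, 2, 5, 6]
print(b2)      # [2, 5, 6, 0, 0]
print(routes)  # {(2, 0): 2, (3, 1): 5, (4, 2): 6}
```

Point S shows up with both a positive amount shipped and a positive amount received, which is how a transhipment point is recognized in the reduced solution.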


3. Example of the Transhipment Problem 

The direct shipment solution and the transhipment solution to a small problem 
will be compared. There are 5 shipping and receiving points involved. They will 
be denoted by Q, R, S, T, U. 





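TABLE IA 

Direct Shipment Problem 

Amounts shipped, $a_i$: 5 (T), 6 (U) 
Amounts received, $b_j$: 2 (Q), 5 (R), 4 (S) 
(The unit costs $c_{ij}$ in the body of the table are not legible in the source.) 

TABLE IB 

Min. Cost Solution to the Direct Shipment Problem 

Amounts shipped, $a_i$: 5 (T), 6 (U) 
Cost = 56 
(The shipments $x_{ij}$ in the body of the table are not legible in the source.) 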


The direct shipment problem, a $T_0$ problem, is shown by Table IA. The body 
of the table contains the unit shipping costs, $c_{ij}$, and the marginal numbers 
are the amounts to be shipped and received. The solution is shown in Table IB, 
where the body of the table contains the $x_{ij}$ for minimum total cost. The simplex 
technique for $T_0$ was used to obtain the solution. 

The transhipment problem is shown in Table IIA. The two shippers and three 
receivers of the direct shipment problem become five points which are both 
shippers and receivers. The $a_i$ and $b_j$ of Table I become 

$g_1 = -2$ (for point Q), $g_2 = -5$ (R), $g_3 = -4$ (S), $g_4 = +5$ (T), $g_5 = +6$ (U). 

The $a_i'$ and $b_i'$ in Table II have been set up according to Eq. (3) with $s = 20$. 
The $c_{ij}$’s enclosed by bold rule are the same as in Table I. The other $c_{ij}$’s are 
additional unit cost information required when transhipment is permitted. All 
$c_{ii} = 0$. The minimum cost solution, obtained by Dantzig’s method, is given in 
Table IIB. 

Using (5) to eliminate the $x_{ii}'$ and to reduce the $a_i'$ and $b_i'$ accordingly to 
$a_i''$ and $b_i''$, the result is shown by Table IIC. The minimum cost of 56 in the 
direct shipment solution is reduced to 52 by the transhipment solution. If the 
problem were non-degenerate there would be four non-zero $x_{ij}$’s, i.e. four routes 
required, in both the direct shipment and the transhipment problems. In this 
example both forms of the problem turn out to be degenerate, requiring only 
three routes. 

The transhipment routes are easily obtained from Table IIC. Point U ships 
6 units to S. But S is to receive a net of 4, ($b_i^0$); therefore 2 units are available 
for transhipment. This provides the 2 units, ($a_i''$), which are shipped from S to 
Q. There appears to be no difficulty involved in carrying out this type of 
resolution of transhipment paths in larger problems. 

TABLE IIB 

Min. Cost Solution to the Transhipment Problem 

TABLE IIC 

Final Solution to the Transhipment Problem 


4. Proof that the Procedure Reaches the Minimum 

A proof follows that the procedure described above determines the minimum 
cost for the transhipment problem. 

Step 1. The possible range of values of $s$ for use in (3) is $0 \le s < +\infty$. For 
any $s$ let $C(s)$ be the minimum cost solution to the $T_0$ problem specified by (3). 
For any $s_2 > s_1$ we have: 

(7)    $C(s_2) \le C(s_1)$ 

Proof of (7): The minimum cost solution using $s_1$ becomes a feasible solution 
for $s_2$ by increasing all $x_{ii}$ by $(s_2 - s_1)$. The change in the $x_{ii}$ does not affect the 
cost, therefore $C(s_2)$ is certainly no larger than $C(s_1)$. $C(s_2)$ may however be less 
than $C(s_1)$. 

Thus $C(s)$ is a monotone decreasing function. Assuming all $c_{ij} \ge 0$, it is 
bounded from below, therefore it has a greatest lower bound. Let $C^*$ = g.l.b. 
$C(s)$. The object of the computation is to find $C^*$, which is the minimum cost for 
the transhipment problem. 

It will be proved that at some $s$ the monotone decreasing function $C(s)$ becomes a 
constant; i.e. reaches C*. The point is that C(s) does not approach C* asymptot¬ 
ically in any manner, e.g., as a curve or as a series of line segments which 
approach nearer and nearer to C*, but actually reaches and remains at C*. If 
the situation were asymptotic, the minimum cost could be approached but 
would not actually be reached. 

Step 2. A value of $s$ can be found such that the minimum cost solution to (3) has 
all $x_{ii} \ne 0$. 

Proof: It has been assumed that all $c_{ij} > 0$ for $i \ne j$. Choose a positive constant, 
$c_0$, such that $c_0 < c_{ij}$ for all these $c_{ij}$. Suppose for all $s$ in $(0 \le s < \infty)$ that at 
least one $x_{ii} = 0$. Then, for any $s$ we would have: 

        $C(s) > c_0 s$ 

since in some row where $x_{ii} = 0$ all costs for material shipped would be $> c_0$. 
Since $s$ can be made arbitrarily large this would contradict (7), therefore for 
large $s$ we have all $x_{ii} \ne 0$. 

Step 3. Consider some $s_1$ which has the property specified by Step 2, that the 
Min. cost solution to (3) has all $x_{ii} \ne 0$. Then, for all $s > s_1$ we have: 

(8)    $C(s) = C(s_1) = C^*$ 

Proof: Let $x_{ij}(s_1)$ be a Min. cost solution for $s_1$. A feasible solution for any 
$s_2 > s_1$ is: 

(9)    $x_{ij}(s_2) = x_{ij}(s_1)$    for $i \ne j$ 
       $x_{ii}(s_2) = x_{ii}(s_1) + (s_2 - s_1)$ 

The theory of Dantzig’s simplex technique for solution of $T_0$ can be used as 
follows to show that (9) is a Min. cost solution for $s_2$: Let $u_i$ and $v_j$ be the final 
set of $u$’s and $v$’s obtained in the simplex process in solving (3) using $s_1$; these 
$u_i$ and $v_j$ satisfy $u_i + v_j \le c_{ij}$. The same set of $u$’s and $v$’s can be applied to the 
computation for $s_2$, and the relations $u_i + v_j \le c_{ij}$ are still satisfied since the $c_{ij}$ 
are unchanged. Therefore the $i, j$ locations of the Min. cost “basic solution” for 
$s_2$ are the same as for $s_1$. Since the $x_{ij}$ of a “basic solution” are unique, (9) is 
the Min. cost solution for $s_2$. By (9), $C(s_2) = C(s_1)$, since the only $x_{ij}$ which 
differ are the $x_{ii}$, which do not affect the cost. Since $s_2$ is any $s > s_1$, and $C(s)$ 
is monotone decreasing, we have (8). 

It has thus been shown that by choice of a large $s$ the procedure for the 
transhipment problem gives the minimum cost. It can be seen intuitively that (4) 
gives a large enough value for $s$. Any $s$ which leads to a solution to (3) for 
which all $x_{ii} \ne 0$ is satisfactory. 

5. Number of Direct Routes 

Consider a $T_0$ problem in which there are $m$ origins and $n$ destinations. The 
Min. cost solution makes use of $(m + n - 1)$ routes. 

Consider a transhipment problem $T_1$ on $M$ points where 
$M = m + n$. The solution procedure makes use of $M$ origins and $M$ destinations, 
leading to a solution involving $2M - 1$ paths. Of these, however, 
$M$ (the $x_{ii}$) are reduced to zero. Thus the final solution to $T_1$ makes 
use of $M - 1 = m + n - 1$ paths, which is the same as for the $T_0$ problem. 

6. Optimum Linked Paths 

It has been assumed in the preceding sections that unit costs are given for 
shipment from any point to any other. The $c_{ij}$’s might, for example, 
refer to the cost of non-stop airplane transportation. In general, however, many 
of the $c_{ij}$’s initially given would be for routes which involve transhipment, i.e. 
which pass through one or more en-route points. In the latter case the given data 
for $i$ to $j$ would be the unit cost, $c_{ij}$, and a route, $(i, i', i'', \ldots, j)$. Whether the 
given $c_{ij}$’s are for direct shipments or involve transhipments they are not 
necessarily the lowest possible costs for each of the links, $i$ to $j$. There is 
a least cost for each link $i$ to $j$ which is not known at the start. 
Denote it by $c_{ij}^*$. Let $i_1$ and $j_1$ be particular points for which 
$c_{i_1 j_1}^*$ is desired. It can be found by solving a transhipment problem using the 
$c_{ij}$’s initially given 

and 

        $g_i = \begin{cases} 1 & \text{for } i = i_1 \\ -1 & \text{for } i = j_1 \\ 0 & \text{for all other } i \end{cases}$ 

7HnSVr^ n ^?g7oblm 8 haVe b6en 8tated dir6Ctly “ 




TABLE IIIC 

Reduced Solution Xu Values 
i.e. a transhipment problem in which the net shipment is $+1$ at $i_1$, $-1$ at $j_1$, and 
zero at all other points. The resulting $c_{i_1 j_1}^*$ will have associated with it a route 
$(i_1, i', i'', \ldots, j_1)$. 

Suppose all the best routes $c_{ij}^*$ were found. Then if any set of net shipments, 
$g_i$, were specified, it would not be necessary to solve the transhipment problem 
using the $c_{ij}$’s initially given. One would, instead, set up a $T_0$ problem in which 
the shipments were $a_i = (g_i \mid g_i > 0)$, receipts were $b_i = (-g_i \mid g_i < 0)$, and 
use the least unit costs, $c_{ij}^*$, already known, in place of the original $c_{ij}$’s. The 
solution would contain transhipment information in the form of the routes 
$(i, i', i'', \ldots, j)$ associated with each $c_{ij}^*$, but this would not enter into the 
computation. The transhipment problem computation would be reduced from $M$ 
origins and $M$ destinations to $m$ origins and $n$ destinations where $m + n = M$. 

The situation is analogous to the relation between solution of simultaneous 
equations and matrix inversion. If one has a single set of simultaneous equations 
to solve, one does not invert the matrix. However, if there are many sets to solve, 
all involving the same matrix, then it pays to invert. Here, if one has a single 
transhipment problem to solve, one would not find all the $c_{ij}^*$, but if there were 
many to solve, all with the same original $c_{ij}$’s, it would pay to find all the $c_{ij}^*$’s 
initially. 

The following is a small example of determination of an optimum linked 
path: the shortest path from one point in a network to another. The network is 
shown in Figure 1. We wish to find the shortest route from point A to point D. 
The direct route distances between each pair of points are shown in the diagram. 

The amounts to be shipped are: 

        $g_1 = 1$ for point A 
        $g_2 = 0$ for point B 
        $g_3 = 0$ for point C 
        $g_4 = -1$ for point D 

As a fictitious stockpile to permit transhipment we can take $s = 10$. The 
amounts to be shipped and received in applying the transhipment technique 
become: 

        $a_1' = 11$    $b_1' = 10$ 
        $a_2' = 10$    $b_2' = 10$ 
        $a_3' = 10$    $b_3' = 10$ 
        $a_4' = 10$    $b_4' = 11$ 

Table IIIA shows the “cost table”, using the distances from Figure 1 as the 
costs, and the marginal totals, $a_i'$ and $b_i'$. Solution of a transportation problem 
based on Table IIIA yields the minimizing solution shown in Table IIIB. Upon 
deducting the diagonal values, $x_{ii}$, in Table IIIB from the marginal quantities, 
the final results appear in Table IIIC as the series of links: A to C (distance = 4), 
and C to D (distance = 1). The path A-C-D, for a total distance of 5, is the 
shortest route from A to D. 

In networks with a large number of points the linear programming solution 
should be practical even when the number of combinatorial possibilities is 
immense. 
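
The whole procedure can be run end to end on a small network. The sketch below rests on several assumptions: the distances in `dist` are hypothetical (only A to C = 4 and C to D = 1 are given in the text, since Figure 1 is not reproduced), the function names are illustrative, and the transportation problem is solved by a successive-shortest-path flow routine as a stand-in for the simplex technique cited in the paper.

```python
def min_cost_transport(a, b, c):
    """Solve a balanced transportation problem with integer supplies a and
    demands b by successive shortest paths (not the simplex technique)."""
    m, n = len(a), len(b)
    src, snk = m + n, m + n + 1
    graph = [[] for _ in range(m + n + 2)]   # edge: [to, capacity, cost, reverse index]

    def add_edge(u, v, cap, cost):
        graph[u].append([v, cap, cost, len(graph[v])])
        graph[v].append([u, 0, -cost, len(graph[u]) - 1])

    for i in range(m):
        add_edge(src, i, a[i], 0)
        for j in range(n):
            add_edge(i, m + j, sum(a), c[i][j])
    for j in range(n):
        add_edge(m + j, snk, b[j], 0)

    while True:
        dist = [float("inf")] * len(graph)
        prev = [None] * len(graph)
        dist[src] = 0
        for _ in range(len(graph)):              # Bellman-Ford relaxation passes
            changed = False
            for u in range(len(graph)):
                if dist[u] == float("inf"):
                    continue
                for k, (v, cap, cost, _rev) in enumerate(graph[u]):
                    if cap > 0 and dist[u] + cost < dist[v]:
                        dist[v], prev[v], changed = dist[u] + cost, (u, k), True
            if not changed:
                break
        if prev[snk] is None:                    # no augmenting path is left
            break
        push, v = float("inf"), snk
        while v != src:                          # bottleneck along the path
            u, k = prev[v]
            push = min(push, graph[u][k][1])
            v = u
        v = snk
        while v != src:                          # augment along the path
            u, k = prev[v]
            graph[u][k][1] -= push
            graph[v][graph[u][k][3]][1] += push
            v = u
    x = [[0] * n for _ in range(m)]
    for i in range(m):
        for v, _cap, _cost, rev in graph[i]:
            if m <= v < m + n:
                x[i][v - m] = graph[v][rev][1]   # flow equals reverse-edge capacity
    return x

def shortest_route(dist, start, end, s=10):
    """Shortest route by the transhipment technique: net shipment +1 at
    start, -1 at end, and a stockpile s at every point."""
    m = len(dist)
    g = [0] * m
    g[start], g[end] = 1, -1
    a = [max(v, 0) + s for v in g]
    b = [max(-v, 0) + s for v in g]
    c = [[0 if i == j else dist[i][j] for j in range(m)] for i in range(m)]
    x = min_cost_transport(a, b, c)
    route, here = [start], start
    while here != end:                           # follow the off-diagonal shipments
        here = next(j for j in range(m) if j != here and x[here][j] > 0)
        route.append(here)
    return route, sum(dist[i][j] for i, j in zip(route, route[1:]))

# Points A, B, C, D are indexed 0..3; all distances except A-C and C-D are assumed.
dist = [[0, 3, 4, 7],
        [3, 0, 2, 4],
        [4, 2, 0, 1],
        [7, 4, 1, 0]]
print(shortest_route(dist, 0, 3))  # ([0, 2, 3], 5)
```

With these assumed distances the route A-C-D of length 5 is recovered, in agreement with the result stated for the example.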


ON A CLASS OF CAPACITATED TRANSPORTATION PROBLEMS* 

HARVEY M. WAGNER† 
Stanford University 

Transportation models (ordinary and transhipment) having certain types 
of capacity constraints on the flows between origins and destinations are 
studied from the point of view of transforming them into enlarged standardized 
(non-capacitated) models. Specifically, constraints on the flow from disjoint 
and/or nested sets of origins to all destinations, and from any single origin to 
disjoint and/or nested sets of destinations are considered. Dual formulations 
are indicated for constraints on the flow to destinations from origins. In the 
case of a set of capacity constraints on the flow from each origin to each 
destination, the models proposed are easily seen to be of minimal dimension for any 
“standardized” version of such a capacitated problem. 

I. Introduction 

In this paper we consider techniques which transform transportation type 
problems subject to a certain class of capacity flow constraints into enlarged 
uncapacitated transportation problems. The simple capacitated Hitchcock 
problem in which each flow $x_{ij}$ from origin $O_i$, $i = 1, 2, \ldots, m$, to destination $D_j$, 
$j = 1, 2, \ldots, n$, is bounded by a positive integer $d_{ij}$ has been considered by 
several authors [1, 4, 6]. These previous papers have offered special and elegant 
algorithms for solving the problem; the methods are more involved than 
standard transportation simplex routines but do have the distinct advantage of 
utilizing a transportation tableau having a single row for each origin and a single 
column for each destination. Dantzig [3] has indicated how this model may also 
be viewed as an $(mn + m)$ row and $(mn + n)$ column ordinary transportation 
problem. We permit a somewhat wider class of capacity constraints. In the 
special case of the simple model just mentioned, our method yields an ordinary 
transportation problem with $(mn)$ rows and $(m + n)$ columns. 

There are two principal reasons why the approaches to be offered seem of 
interest. First, from the theoretical side, they provide (as does Dantzig’s scheme 
in the simple case) the connection between the standard and several types of 
capacitated models, and thus demonstrate that the additional restrictions can 
be viewed conceptually as merely straightforward extensions of the familiar 
model.¹ Second, on the practical side, they allow a capacitated problem to be 

* Received May 1958. 

t This work was supported in part by an Office of Naval Research contract. 

1 The motivating idea may be contrasted to that of “secondary constraints” [2, 5, 8, 9]. 
The latter technique suggests the use of a subsystem of constraints which may yield an 
optimal feasible solution to the entire model and is computationally amenable by virtue of 
its abbreviated size; the technique herein suggests the use of an enlarged system which 
does yield an optimal solution to the original model and may be computationally amenable 
by virtue of its special form. 

To demonstrate the legitimacy of the transformations, first one needs to show that any 
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solved by personnel familiar with the usual algorithm or by an automatic com¬ 
puter previously coded for the standard method without expending any new 
effort, provided of course that the dimensions of the enlarged problem do not 
become prohibitive. An ancillary reason for interest in the methods is that demon¬ 
stration of the equivalence relationships may lead to the formulation of special 
algorithms utilizing the m row and n column tableau as have been offered for the 
solution of the simple capacitated model [1, 6]. 

In Sections II, III, and IV we consider the usual transportation model 
subject to a certain class of capacity constraints, the transformation schemes 
for conversion of the models, and applications to several examples. We extend 
our analysis to a transhipment model [7] in Sections V and VI. 

II. Models of Capacitated Transportation Problems 

The basic transportation model underlying our discussion in Sections II, 
III, and IV is 

(1a)  minimize $\sum_{i=1}^{m} \sum_{j=1}^{n} d_{ij} x_{ij}$ 

subject to the constraints 

(1b)  $\sum_{j=1}^{n} x_{ij} \le a_i$,  $i = 1, 2, \ldots, m$ 

(1c)  $\sum_{i=1}^{m} x_{ij} \ge b_j$,  $j = 1, 2, \ldots, n$ 

(1d)  $x_{ij} \ge 0$ 

where the $a_i$ and $b_j$ are positive integers corresponding to the supply and 
demand at origin $O_i$ and destination $D_j$, respectively, and $x_{ij}$ is the 
shipment between these points at the unit cost $d_{ij}$. 

It is convenient to put (1) into the canonical form 

(2a)  minimize $\sum_{i=0}^{m} \sum_{j=0}^{n} d_{ij} x_{ij}$ 

subject to the constraints 

(2b)  $\sum_{j=0}^{n} x_{ij} = a_i$,  $i = 0, 1, \ldots, m$ 

(2c)  $\sum_{i=0}^{m} x_{ij} = b_j$,  $j = 0, 1, \ldots, n$ 

(2d)  $x_{ij} \ge 0$ 

(2e) 

where 

(2f)  $a_0 = \text{maximum}\,(0, \sum_{j=1}^{n} b_j - \sum_{i=1}^{m} a_i)$ 

(2g)  $b_0 = \text{maximum}\,(0, \sum_{i=1}^{m} a_i - \sum_{j=1}^{n} b_j)$ 

(2h) 

¹ (continued) feasible solution to the original capacitated problem is also a 
feasible solution to the enlarged model and has the same cost; consequently the 
value of the optimal solution of the transformed problem is a lower bound to the 
value of the optimal solution of the original problem. Second, one must 
demonstrate that, given an optimal solution to the enlarged model, there is a 
corresponding feasible solution to the original model having the same cost 
value. 


We have added a fictitious origin $O_0$ and destination $D_0$ which have the 
function of providing for the possible excess of demand over supply and of 
draining the possible excess of supply over demand, respectively. Since either 
demand exceeds supply, supply exceeds demand, or they are equal, no more than 
one of $a_0$ and $b_0$ will be strictly positive. In an uncapacitated problem, 
the inessential fictitious location(s) would be omitted; in the transformations 
imposing capacity constraints, we shall find it convenient to assume that both 
fictitious points exist. Once all the transformations have been applied, if 
either $a_0$ or $b_0$ (or both) equals zero, the corresponding location(s) may 
be eliminated from the model. We shall think of (2) as being arrayed in the 
familiar transportation tableau, each row of which corresponds to an origin and 
each column to a destination. 

Letting $I$ and $J$ be a set of row and column indices, we associate with the 
positive integer $c_{I;J}$ the capacity constraint 

(3)  $\sum_{i \in I} \sum_{j \in J} x_{ij} \le c_{I;J}$. 


Throughout this paper we consider constraining relations of the forms 

(i) multiple-row constraint: $I$ contains more than one index and $J = 
$\{1, 2, \ldots, n\}$ 

(ii) single-row constraint: $I$ contains a single index and $J$ is an arbitrary 
set of column indices 

(iii) multiple-column constraint: $J$ contains more than one index and $I = 
$\{1, 2, \ldots, m\}$ 

(iv) single-column constraint: $J$ contains a single index and $I$ is an 
arbitrary set of row indices. 

The simple capacity constraint (on a single flow $x_{ij}$) falls into 
categories (ii) and (iv). We say that constraint $c_{K;L}$ is nested within 
constraint $c_{I;J}$ if $I \supseteq K$ and $J \supseteq L$; we say that the two 
constraints are row (column) disjoint if $I$ and $K$ ($J$ and $L$) have no row 
(column) indices in common. 
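
The four categories and the nesting and disjointness relations can be checked mechanically. The following is a minimal sketch, assuming a constraint is represented as a pair of index sets (I, J) with rows numbered 1 to m and columns 1 to n; the function names are illustrative and do not come from the paper.

```python
def classify(I, J, m, n):
    """Return the categories (i)-(iv) that the constraint on (I, J) falls into."""
    I, J = frozenset(I), frozenset(J)
    all_rows = frozenset(range(1, m + 1))
    all_cols = frozenset(range(1, n + 1))
    kinds = []
    if len(I) > 1 and J == all_cols:
        kinds.append("multiple-row")        # category (i)
    if len(I) == 1:
        kinds.append("single-row")          # category (ii)
    if len(J) > 1 and I == all_rows:
        kinds.append("multiple-column")     # category (iii)
    if len(J) == 1:
        kinds.append("single-column")       # category (iv)
    return kinds


def is_nested(inner, outer):
    """c_{K;L} is nested within c_{I;J} when I contains K and J contains L."""
    (K, L), (I, J) = inner, outer
    return frozenset(K) <= frozenset(I) and frozenset(L) <= frozenset(J)


def row_disjoint(c1, c2):
    """True when the two constraints share no row index."""
    return not (frozenset(c1[0]) & frozenset(c2[0]))


def column_disjoint(c1, c2):
    """True when the two constraints share no column index."""
    return not (frozenset(c1[1]) & frozenset(c2[1]))


# A simple capacity on the single flow x_12 in a 3-origin, 4-destination model.
simple = classify({1}, {2}, m=3, n=4)
```

Here `simple` comes out as both a single-row and a single-column constraint, which matches the remark that a simple capacity falls into categories (ii) and (iv).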


III. Elementary Transformations 

In this section we show how each of the constraining relations above in 
conjunction with model (2) may be transformed into the format of a standard 
transportation problem. We further indicate a manner for transforming certain 
systems of the above constraints, some of which may be nested within others, 
into an enlarged problem in canonical form. 

1A. One multiple-row constraint. Assume that $I$ contains a collection of $k$ 
indices. We create a single (fictitious) new origin $O^*_{m+1}$ and $k$ new 
destinations $D^*_{n+1}, \ldots, D^*_{n+k}$, each associated with a particular 
$i \in I$. We describe the relation associating each $i \in I$ to one of the new 
destinations by the notation $j(i)$, and let 

(4a)  $a_{m+1} = c_{I;\,1,2,\ldots,n}$ 

(4b)  $b_{j(i)} = a_i$,  $i \in I$. 

In order that the capacity constraint be truly binding 

(4c)  $\sum_{i \in I} a_i > c_{I;J}$. 

If $b_0 = 0$, then total supply does not exceed total demand and (4c) would 
imply that a certain amount of infeasibility were being introduced by the 
constraint. If $b_0 > 0$, then the new constraint may or may not create 
infeasibility. In any case we make the revisions 

(4d)  $a'_0 = a_0 - \text{minimum}\,(0,\; b_0 + c_{I;J} - \sum_{i \in I} a_i) \ge 0$ 

(4e)  $b'_0 = \text{maximum}\,(0,\; b_0 + c_{I;J} - \sum_{i \in I} a_i) \ge 0$ 

and note that if $a'_0 > a_0$, then we have formally avoided the problem of 
infeasibility by creating an addition to the fictitious supply at $O_0$. 

Denoting an arbitrarily large positive number by $M$, we (i) revise the unit 
shipment costs so that 

(4f)  $d_{i0} = M$,  $i \in I$ 

(ii) assign a zero unit cost to the entries at the intersection of $O^*_{m+1}$ 
with $D_0, D^*_{n+1}, \ldots, D^*_{n+k}$, (iii) assign a zero unit cost to the 
entry at the intersection of the $i$-th row, $i \in I$, with $D^*_{j(i)}$, and 
(iv) assign the unit cost $M$ to all other entries in the new row and columns. 

The relationship between an optimal solution to the enlarged problem, say 
$y_{ij}$, and an optimal solution to the original capacitated problem is given 
by 

(5a)  $x_{i0} = y_{i,j(i)}$,  $i \in I$ 

(5b)  $x_{ij} = y_{ij}$,  $i \in I$,  $j = 1, 2, \ldots, n$ 

(5c)  $x_{ij} = y_{ij}$,  $i \notin I$,  $j = 0, 1, \ldots, n$. 

Note that 

(5d)  $\sum_{j=1}^{n} x_{ij} = y_{m+1,\,j(i)}$,  $i \in I$ 

and that the correspondence in (5) is such that all shipments to $D_j$, 
$j = 1, 2, \ldots, n$, are found in column $D_j$ of the optimal tableau of the 
enlarged problem. 

Constraint: $c_{1,2;\,1,2,3,4}$ 

Figure 1 

An example of the transformation applied to a three origin and four 
destination model is pictured in Figure 1. 
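
The bookkeeping of transformation (1A) can be sketched in code. The following is an illustrative construction only, assuming the tableau is stored as a list of rows with index 0 reserved for the fictitious origin and destination, and that the new destinations are numbered in increasing order of $i \in I$. The function name `add_multiple_row_constraint` and the sample data are hypothetical, and the finite number standing in for $M$ is an arbitrary choice.

```python
def add_multiple_row_constraint(d, a, b, I, c, M=10**6):
    """Enlarge a canonical tableau for one multiple-row constraint c_{I;1..n}.

    d: (m+1) x (n+1) unit costs, row 0 and column 0 being fictitious.
    a, b: availabilities a_0..a_m and demands b_0..b_n (balanced).
    Returns the enlarged costs, availabilities, demands and the map j(i).
    """
    m, n = len(a) - 1, len(b) - 1
    I = sorted(I)
    j_of = {i: n + 1 + t for t, i in enumerate(I)}       # relation j(i)
    cost = [[M] * (n + 1 + len(I)) for _ in range(m + 2)]  # rule (iv) default
    for i in range(m + 1):
        for j in range(n + 1):
            cost[i][j] = d[i][j]
    for i in I:
        cost[i][0] = M               # (4f): constrained rows cannot use D_0
        cost[i][j_of[i]] = 0         # rule (iii)
    cost[m + 1][0] = 0               # rule (ii): new origin with D_0 ...
    for j in j_of.values():
        cost[m + 1][j] = 0           # ... and with every new destination
    t = b[0] + c - sum(a[i] for i in I)
    supply = [a[0] - min(0, t)] + list(a[1:]) + [c]          # (4d), (4a)
    demand = [max(0, t)] + list(b[1:]) + [a[i] for i in I]   # (4e), (4b)
    return cost, supply, demand, j_of


# Hypothetical three-origin, four-destination data with a constraint on rows 1, 2.
M = 10**6
d = [[0, 0, 0, 0, 0],
     [0, 11, 12, 13, 14],
     [0, 21, 22, 23, 24],
     [0, 31, 32, 33, 34]]
a = [0, 5, 6, 4]
b = [3, 3, 3, 3, 3]
cost, supply, demand, j_of = add_multiple_row_constraint(d, a, b, {1, 2}, 8, M)
```

The enlarged problem stays balanced, so it can be handed to any ordinary transportation routine; the original shipments are then read off through (5a) to (5c).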

1B. A disjoint set of multiple-row constraints. If there are several 
multiple-row constraints, every pair of which is row disjoint, the 
transformation in (1A) may be applied successively for each constraint. 

1C. Nested multiple-row constraints. If one multiple-row constraint is nested 
within another multiple-row constraint, then the former may be considered as 
being transformed by (5d) into a capacity constraint on flows from the new 
origin $O^*_{m+1}$ to certain of the new destinations. Consequently a method for 
handling nested multiple-row constraints will emerge from our method for 
handling capacity constraints on entries within a single row. 

2A. A disjoint set of constraints on a single row². If there are a set of $h$ 
constraints on a single row $i$, each pair of which is column disjoint, then we 
define $h$ new origins $O^*_{m+1}, \ldots, O^*_{m+h}$, one corresponding to each 
constraint, and a single new destination $D^*_{n+1}$. We describe the relation 
associating each constraint to one of the new origins by the notation 
$i(J_l)$, $l = 1, 2, \ldots, h$, and let 

(6a)  $a_{i(J_l)} = c_{i;J_l}$,  $l = 1, 2, \ldots, h$ 

(6b)  $b_{n+1} = \sum_{l=1}^{h} c_{i;J_l}$. 

We (i) revise the unit costs of shipment so that 

(6c)  $d_{ij} = M$,  $j \in J_l$,  $l = 1, 2, \ldots, h$ 

(ii) assign a zero unit cost to the entries at the intersection of $O_i$, 
$O^*_{m+1}, \ldots, O^*_{m+h}$ with $D^*_{n+1}$, (iii) assign the cost 

(6d)  $d_{i(J_l),\,j} = d_{ij}$,  $j \in J_l$,  $l = 1, 2, \ldots, h$ 

and (iv) assign a unit cost $M$ to all other entries in the new rows and column. 

² The condition of one constraint on a single row is a special case of this 
category. 


II-2 - DETERMINISTIC DECISION MODELS 


Origins  |                 Destinations                          | Availabilities 
         | D_0    D_1     D_2     D_3     D_4     D*_5           | 
O_0      | 0      0       0       0       0       M              | a_0 
O_1      | 0      M       M       M       d_14    0              | a_1 
O_2      | 0      d_21    d_22    d_23    d_24    M              | a_2 
O*_3     | M      d_11    M       M       M       0              | c_{1;1} 
O*_4     | M      M       d_12    d_13    M       0              | c_{1;2,3} 
Demands  | b_0    b_1     b_2     b_3     b_4     c_{1;1} + c_{1;2,3} | 

Constraints: $c_{1;1}$ and $c_{1;2,3}$ 

Figure 2 


Figure 3 


The relationship between the optimal solution to the enlarged problem, 
$y_{ij}$, and the optimal solution to the original capacitated problem is given 
by 

(7a)  $x_{ij} = y_{i(J_l),\,j}$,  $j \in J_l$,  $l = 1, 2, \ldots, h$ 

(7b)  $x_{ij} = y_{ij}$,  $j \notin J_l$,  $l = 1, 2, \ldots, h$. 

As in the multiple-row transformations, the correspondence in (7) is such that 
all shipments to $D_j$, $j = 1, 2, \ldots, n$, are found in column $D_j$ of the 
optimal tableau of the enlarged problem. 

An example of a single-row capacitated model with two origins and four 
destinations is pictured in Figure 2. 
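
Transformation (2A) can be sketched the same way. The code below is a minimal illustration under the same storage assumptions as before (index 0 for the fictitious origin and destination); the function name and the numerical data are hypothetical, chosen so that the result has the shape of the two-origin, four-destination tableau with constraints $c_{1;1}$ and $c_{1;2,3}$.

```python
def add_single_row_constraints(d, a, b, i, constraints, M=10**6):
    """Enlarge a canonical tableau for column-disjoint constraints on row i.

    constraints: list of (J_l, c_l) pairs, J_l a set of column indices.
    Returns the enlarged costs, availabilities and demands.
    """
    m, n = len(a) - 1, len(b) - 1
    seen = set()
    for J, _ in constraints:
        assert not (seen & set(J)), "constraints must be column disjoint"
        seen |= set(J)
    new_col = n + 1
    cost = [[M] * (n + 2) for _ in range(m + 1 + len(constraints))]  # rule (iv)
    for r in range(m + 1):
        for j in range(n + 1):
            cost[r][j] = d[r][j]
    cost[i][new_col] = 0                    # rule (ii): row i with new column
    supply = list(a)
    for l, (J, cap) in enumerate(constraints):
        r = m + 1 + l                       # new origin i(J_l)
        cost[r][new_col] = 0                # rule (ii)
        for j in J:
            cost[r][j] = d[i][j]            # (6d): cost displaced to new row
            cost[i][j] = M                  # (6c)
        supply.append(cap)                  # (6a)
    demand = list(b) + [sum(cap for _, cap in constraints)]   # (6b)
    return cost, supply, demand


M = 10**6
d = [[0, 0, 0, 0, 0],
     [0, 11, 12, 13, 14],
     [0, 21, 22, 23, 24]]
a = [0, 7, 5]
b = [2, 3, 3, 2, 2]
cost, supply, demand = add_single_row_constraints(
    d, a, b, 1, [({1}, 2), ({2, 3}, 4)], M)
```

After solving the enlarged problem, the constrained flows of row 1 are read from the new rows as in (7a) and the remaining flows are unchanged as in (7b).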

2B. A disjoint set of single-row constraints nested within another single-row 
constraint. The transformation (2A) is such that for each constraint 
$c_{i;J_l}$ a new row $i(J_l)$ is defined in which by (7a) there is a one-to-one 
correspondence between the original $x_{ij}$, $j \in J_l$, and the 
$y_{i(J_l),\,j}$. Consequently a nested constraint $c_{i;J'}$, where by 
definition $J' \subset J_l$, may be considered as being transformed into a 
constraint on a single row $i(J_l)$ and handled accordingly. Similarly a set of 
column-disjoint constraints nested within the constraint $c_{i;J_l}$ may be 
handled as a disjoint set of constraints on a single row $i(J_l)$. 

Origins  |                 Destinations                      | Availabilities 
         | D_0    D_1      D_2      ...    D_12              | 
O_1      | 0      d_11     d_12     ...    d_{1,12}          | a_1 
O_2      | 0      d_21     d_22     ...    d_{2,12}          | a_2 
...      | 
O_7      | 0      d_71     d_72     ...    d_{7,12}          | a_7 
Demands  | b_0    b_1      b_2      ...    b_12              | 

Assumption: $a_0 = 0$ 

Constraints 

1. $c_{1;\,1,2}$ 
2. $c_{1;\,3}$ 
3. $c_{1;\,4,7}$ 
4. $c_{1;\,8,9,10,11}$ 
5. $c_{1;\,9,10,11}$ 
6. $c_{1;\,10,11}$ 
7. $c_{2,3,4,5;\,1,2,\ldots,12}$ 
8. $c_{2,3;\,1,2,\ldots,12}$ 
9. $c_{4;\,2,3,4}$ 
10. $c_{4;\,2}$ 
11. $c_{4;\,7,9}$ 
12. $c_{4;\,12}$ 

Figure 4a 


Figure 3 gives a two origin and four destination model with a nested 
constraint. 

2C. A collection of sets of single-row constraints. If more than one origin has 
a set of single-row capacity constraints on the shipments therefrom, the 
transformations (2A) and (2B) may be performed successively by taking, for 
example, one row at a time and by exhausting all of the constraints upon it 
before advancing to the next row. 

3. A collection of column constraints. It is left as an “exercise” for the 
interested reader to verify that single and multiple column capacity constraints 
may be handled by transformations analogous to the ones above by interchanging 
the role played by the additional fictitious origins and destinations (including 
$O_0$ and $D_0$). 

4. A collection of row and column constraints. As we noticed in the suggested 
transformations for single and multiple row constraints, the shipments to any 
destination $D_j$, $j = 1, 2, \ldots, n$, are always to be found in the enlarged 
problem in column $D_j$, and furthermore no “fictitious” shipments to $D_j$ are 
introduced in the transformations³. Thus if a collection of row and column 
constraints is imposed on (2), we may proceed first to make all necessary row 
transformations, and subsequently to make all necessary column transformations. 

IV. Examples 

In Figure 4 we trace the development of a standardized transportation model 
stemming from a model with seven origins and twelve destinations, subject to 
twelve row constraints. In Figure 4a we give the model (2) and a statement of 
the twelve constraints. We first use transformation (2A) to account for 
constraints Number 1, 2, 3, and 4, Figure 4b. Constraints Number 5 and 6 are 
added by transformation (2B) in Figure 4c. The multiple-row constraint Number 7 
is added by transformation (1A) in Figure 4d, and the nested multiple-row 
constraint Number 8 is included by transformation (1C) in Figure 4e. 
Transformations (2A) and (2B) are used to impose constraints Number 9, 10, 11, 
and 12, Figure 4f. 

³ The reader will recall that in adding new origins, either the entry at the 
intersection of the new origin with $D_j$ was assigned a unit cost of $M$, 
prohibiting the use of the route in an optimal (feasible) solution, or was 
assigned a unit cost $d_{ij}$ that had been “displaced” from another entry in 
$D_j$ where the unit cost had been changed to $M$. 

Origins  |                      Destinations                           | Availabilities 
         | D_0    D_1    ...   D*_15  D*_16  D*_17  D*_18  D*_19       | 
O_1      | 0      M      ...   M      M      M      M      M           | a_1 
O_2      | M      d_21   ...   M      0      M      M      M           | a_2 
O_3      | M      d_31   ...   M      M      0      M      M           | a_3 
O_4      | M      d_41   ...   M      M      M      0      M           | a_4 
O_5      | M      d_51   ...   M      M      M      M      0           | a_5 
O*_13    | M      M      ...   0      M      M      M      M           | c_{1;10,11} 
O*_14    | 0      M      ...   M      0      0      0      0           | c_{2,3,4,5;1,2,...,12} 
Demands  | b'_0   b_1    ...   b_15   b_16   b_17   b_18   b_19        | 

Constraint Number 7 
Under Assumption: 
$b'_0 = b_0 + c_{2,3,4,5;\,1,2,\ldots,12} - (a_2 + a_3 + a_4 + a_5) \ge 0$ 
$b_{16} = a_2$ 
$b_{17} = a_3$ 
$b_{18} = a_4$ 
$b_{19} = a_5$ 

Figure 4d 


Origins  |                      Destinations                                  | Availabilities 
         | D_0    D_1    ...   D*_15  D*_16  D*_17  D*_18  D*_19  D*_20       | 
O_1      | 0      M      ...   M      M      M      M      M      M           | a_1 
O*_13    | M      M      ...   0      M      M      M      M      M           | c_{1;10,11} 
O*_14    | 0      M      ...   M      M      M      0      0      0           | c_{2,3,4,5;1,2,...,12} 
O*_15    | M      M      ...   M      0      0      M      M      0           | c_{2,3;1,2,...,12} 
Demands  | b'_0   b_1    ...   b_15   b_16   b_17   b_18   b_19   b_20        | 

Constraint Number 8 
$b_{20} = c_{2,3;\,1,2,\ldots,12}$ 

Figure 4e 


Origins  |            Destinations                 | Availabilities 
         | D_0    D_1     D_2     D*_3    D*_4     | 
O_0      | 0      0       0       M       M        | a_0 
O_1      | 0      M       M       d_11    d_12     | a_1 
O_2      | 0      M       M       d_21    d_22     | a_2 
O_3      | 0      d_31    d_32    M       M        | a_3 
O_4      | 0      d_41    d_42    M       M        | a_4 
O*_5     | M      0       M       0       M        | c_{1,2;1} 
O*_6     | M      M       0       M       0        | c_{1,2;2} 
Demands  | b_0    b_1     b_2     c_{1,2;1}  c_{1,2;2} | 

Constraints: $c_{1,2;\,1}$ and $c_{1,2;\,2}$ 

Figure 5 


A second example having four origins and two destinations is explored in 
Figure 5. Suppose the origins $O_1$ and $O_2$ supply a somewhat “inferior” item 
to that available at the remaining origins, and as a consequence, each 
destination $D_j$ must not receive more than $c_{1,2;\,j}$ of product from 
$O_1$ and $O_2$. We thus have a set of single-column capacity constraints 
leading to the enlarged model in Figure 5. 

As a third illustration, we consider the simple capacitated Hitchcock problem. 
Since a capacity on a single $x_{ij}$ may be viewed as a single-row or a 
single-column constraint, we have two types of transformations from which to 
choose. Figure 6 shows a two origin and three destination model and the 
associated enlarged problems under the alternatives that the simple constraints 
are handled as row or column limitations. Note that the condition that each 
$x_{ij}$ is constrained allows us to modify the tableau resulting from 
transformation (2C) so as to eliminate the original set of $O_i$, 
$i = 1, 2, \ldots, m$, or $D_j$, $j = 1, 2, \ldots, n$. In general, if the 
capacities are viewed as either single-row or single-column constraints, the 
resultant standardized model consists of a total of $(m + mn + n + 2)$ rows and 
columns⁴. This number may be reduced by one as either $a_0$ or $b_0$ (or both) 
equals zero. It is noteworthy that both the row and column formulations are of 
minimal dimension for a standardized simple capacitated transportation model, 
as the underlying mathematical model consists of $(m + mn + n + 1)$ 
restrictions⁵. 

⁴ In the single-row interpretation, we are assuming 

$\sum_{j=1}^{n} c_{ij} \ge a_i$. 

If the opposite inequality holds, we revise $a_i$ to 

$a_i = \sum_{j=1}^{n} c_{ij}$ 

and eliminate destination $D^*_{n+i}$. A similar remark holds for the 
single-column interpretation. 

⁵ As we noted previously, Dantzig’s transformation results in a standardized 
model having a total number of rows and columns of the order $(m + 2mn + n)$. 


Simple Capacitated Hitchcock Model - Single Row Interpretation 

Figure 6a 


Figure 6b 

V. Model of a Capacitated Transhipment Problem 

Orden [7] has considered an extension of the transportation model to allow 
for the transhipment of the resource through locations, i.e., a location may 
both receive and ship amounts of the resource. We assume that there are 
locations $L_g$, $g = 1, 2, \ldots, p$, to which are associated integral numbers 
$r_g$; a positive $r_g$ signifies that location $g$ has that net amount of 
available resource for shipment elsewhere, and a negative $r_g$ signifies that 
location $g$ has that net requirement of the resource to be shipped from 
elsewhere. For the sake of simplicity of exposition, we assume that a 
transhipment may be made through any $L_g$; the reader may verify that if some 
locations are restricted to be origins only and others to be destinations only, 
the transportation tableau is easily altered by removing the corresponding 
columns and rows, respectively. 

Analogous to (2f) and (2g), we introduce an artificial location $L_0$ with an 
associated 

(8a)  $r_0 = -\sum_{g=1}^{p} r_g$. 

That is, if the total available resources exceed the total requirements, 
location $L_0$ has a net demand for the excess; and if total requirements exceed 
the total available resources, $L_0$ provides a fictitious supply for the 
deficiency. The mathematical model may be summarized as 

(8b)  minimize $\sum_{g'=0}^{p} \sum_{g=0}^{p} d_{g'g} x_{g'g}$ 

subject to the constraints 

(8c)  $\sum_{g'=0}^{p} x_{gg'} - \sum_{g'=0}^{p} x_{g'g} = r_g$,  $g = 0, 1, \ldots, p$ 

(8d)  $x_{g'g} \ge 0$ 

where 

(8e)  $d_{g'g} = 0$ if $g = g'$ 

(8f)  $d_{0g} = d_{g'0} = 0$. 

Letting 

(9a)  $r_g^+ = \text{maximum}\,(0, r_g) \ge 0$ 

(9b)  $r_g^- = \text{minimum}\,(0, r_g) \le 0$ 

we define 

(10a)  $s = \sum_{g=0}^{p} r_g^+$. 

We may interpret $s$ as the maximum amount of the resource which would ever be 
transhipped through any location.⁶ Following Orden’s presentation, we transform 
(8) into the following standardized transportation model 

(10b)  minimize $\sum_{i=0}^{p} \sum_{j=0}^{p} d_{ij} x_{ij}$ 

subject to the constraints 

(10c)  $\sum_{j=0}^{p} x_{0j} = \text{maximum}\,(0, r_0) = a_0$ 

(10d)  $\sum_{j=0}^{p} x_{ij} = s + r_i^+ = a_i$,  $i = 1, 2, \ldots, p$ 

(10e)  $\sum_{i=0}^{p} x_{i0} = -\text{minimum}\,(0, r_0) = b_0$ 

(10f)  $\sum_{i=0}^{p} x_{ij} = s - r_j^- = b_j$,  $j = 1, 2, \ldots, p$ 

(10g)  $x_{ij} \ge 0$ 

An example of a four location model is exhibited in Figure 7. 

⁶ More specifically, we are assuming that $s$ represents the total amount of 
the resource available in the entire system. If there are additional buffer 
stocks which may be used in transhipments, then $s$ should be increased by the 
corresponding amount. 
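
The margins of the standardized model (10) follow directly from the net amounts $r_g$. The sketch below computes them for a hypothetical four location example; the function name `standardize_transhipment` and the sample values of $r_g$ are illustrative assumptions, not data from the paper.

```python
def standardize_transhipment(r):
    """Availabilities and demands of model (10) from net amounts r_1..r_p.

    Index 0 of the returned lists belongs to the artificial location L_0.
    """
    r0 = -sum(r)                                            # (8a)
    s = sum(max(0, x) for x in [r0] + list(r))              # (10a)
    supply = [max(0, r0)] + [s + max(0, x) for x in r]      # (10c), (10d)
    demand = [-min(0, r0)] + [s - min(0, x) for x in r]     # (10e), (10f)
    return supply, demand, s


# Two locations with a net surplus (5 and 4) and two with a net requirement.
supply, demand, s = standardize_transhipment([5, -3, 4, -2])
```

Total availability and total demand agree because the $r_g$, including $r_0$, sum to zero; the amount $s$ added to every real row and column is what permits shipments to pass through a location.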


VI. Transformations for the Transhipment Model 

Having put the transhipment problem into a standardized model (10), we are 
ready to examine to what extent the techniques given in Section III may be 
applied. Clearly from a formal point of view, there are no difficulties in 
executing the standardizing methods. But our definition of a multiple-row or 
multiple-column constraint (viz., one that extends over all columns or rows, 
except $O_0$ and $D_0$) is probably of little practical significance in the 
transhipment model as the constraint would be taken to include the fictitious 
shipments $x_{gg}$ from a location to itself. The single-row and single-column 
transformations do apply without any loss of significance in meaning. 

In the special case of a simple capacitated transhipment model (in which 
there is a capacity constraint on each flow $x_{ij}$, $i \ne j$) the model may 
be described by a transportation tableau having a total of $(p^2 + 1)$ rows and 
columns,⁷ which is of minimal dimension for a standardized capacitated 
transhipment model. Figure 8 illustrates the tableau for a four location 
example. 

Locations |  L_0    L_1     L_2     L_3     L_4    | Availabilities 
L_0       |  0      0       0       0       0      | max (0, r_0) 
L_1       |  0      0       d_12    d_13    d_14   | s + r_1^+ 
L_2       |  0      d_21    0       d_23    d_24   | s + r_2^+ 
L_3       |  0      d_31    d_32    0       d_34   | s + r_3^+ 
L_4       |  0      d_41    d_42    d_43    0      | s + r_4^+ 
Demands   |  -min (0, r_0)   s - r_1^-   s - r_2^-   s - r_3^-   s - r_4^- | 

Transhipment Model 

Figure 7 


Figure 8 


A SUGGESTED COMPUTATION FOR MAXIMAL MULTI-COMMODITY 
NETWORK FLOWS* 

L. R. FORD, JR. and D. R. FULKERSON 
The RAND Corporation, Santa Monica, California 

A simplex computation for an arc-chain formulation of the maximal 
multi-commodity network flow problem is proposed. Since the number of variables 
in this formulation is too large to be dealt with explicitly, the computation 
treats non-basic variables implicitly by replacing the usual method of 
determining a vector to enter the basis with several applications of a 
combinatorial algorithm for finding a shortest chain joining a pair of points in 
a network. 


1. Introduction 

A problem of some importance in applications of linear programming is the 
determination of maximal multi-commodity flows in networks. For example, 
problems which have been proposed recently in studies of communication networks 
[5] can be cast in this form. Straightforward application of the simplex method 
to such problems is usually not feasible, since even small networks may generate 
linear programs which are too large for present machine capacity. What is needed 
are computing schemes that take advantage of the structure of such problems. For 
the single commodity case, various easy computations are known [1, 3, 4], but 
the multi-commodity problem has remained relatively unexplored. 

Examination of simple examples makes it appear that the multi-commodity 
problem is considerably more complex than the single commodity one. Certainly 
the nice combinatorial features of the single commodity case are lost in the 
generalization: simplex bases (for any formulation of the problem known to us) 
are not triangular, hence addition and subtraction do not suffice to solve such 
problems by the simplex method; the max flow min cut theorem, true for single 
commodity networks, is false; and no simple-minded modification of the labeling 
process [4] seems to work. 

The purpose of this note is to suggest a computation which makes some use of 
the structure of one formulation of the multi-commodity problem within the 
framework of a simplex computation. For this particular formulation, the matrix 
of the linear program is the incidence matrix of arcs vs. all chains joining 
sources and sinks for the various commodities, and thus the number of variables 
is too large to be dealt with explicitly. The suggested computation treats 
non-basic variables implicitly by replacing the usual operation (i.e. the 
determination of a vector to enter the basis) with several applications of an 
algorithm for finding a shortest chain. 


* Received October 1957. 

2. Arc-chain Formulation 

Let $A_1, \ldots, A_m$ be a list of the arcs of the network, 
$C_1, \ldots, C_n$ a list of all chains that join, for the various commodities, 
all the sources for a commodity with all sinks for the same commodity, and let 
$A = (a_{rs})$ be the $m \times n$ incidence matrix of arcs vs. commodity 
chains: 

(1)  $a_{rs} = \begin{cases} 1 & \text{if } C_s \text{ contains } A_r, \\ 0 & \text{otherwise.} \end{cases}$ 

Thus, for example, if the network is that of Fig. 1, with sources $P_1$, $P_2$, 
sink $P_3$ for one commodity, and source $P_4$, sink $P_1$ for a second 
commodity, the matrix $A$ is as shown in Fig. 2. 

If we let $x_s$, $s = 1, \ldots, n$, denote the amount of commodity flow along 
$C_s$, and $b_r$ the flow capacity of $A_r$, then the multi-commodity maximal 
flow problem is represented by the linear program: 

(2)  maximize $\sum_{s=1}^{n} x_s$ 

subject to the constraints 

(3)  $\sum_{s=1}^{n} a_{rs} x_s + x_{n+r} = b_r$,  $x_1, \ldots, x_{n+m} \ge 0$. 

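
To make the arc-chain formulation concrete, the incidence matrix $A$ can be generated for a small network by listing every chain. The sketch below is illustrative only: the four-arc directed network is a made-up example (not the network of Fig. 1), chains are taken to be simple directed paths, and the function names are assumptions. Explicit enumeration of this kind is feasible only for very small networks.

```python
def enumerate_chains(arcs, sources, sinks):
    """All simple directed chains (lists of arc indices) from a source to a sink."""
    out = {}
    for r, (u, v) in enumerate(arcs):
        out.setdefault(u, []).append((r, v))
    chains = []

    def extend(node, visited, path):
        if path and node in sinks:
            chains.append(list(path))       # record, then keep extending
        for r, v in out.get(node, []):
            if v not in visited:
                visited.add(v)
                path.append(r)
                extend(v, visited, path)
                path.pop()
                visited.remove(v)

    for s in sorted(sources):
        extend(s, {s}, [])
    return chains


def incidence_matrix(arcs, commodities):
    """Return (A, chains) with A[r][s] = 1 when chain s contains arc r."""
    chains = []
    for sources, sinks in commodities:
        chains.extend(enumerate_chains(arcs, sources, sinks))
    A = [[1 if r in chain else 0 for chain in chains] for r in range(len(arcs))]
    return A, chains


# Hypothetical network: arcs 1->3, 2->3, 1->2, 3->1.
arcs = [(1, 3), (2, 3), (1, 2), (3, 1)]
commodities = [({1, 2}, {3}),   # commodity 1: sources 1, 2 and sink 3
               ({3}, {1})]      # commodity 2: source 3 and sink 1
A, chains = incidence_matrix(arcs, commodities)
```

Each row of `A` together with a slack variable and the arc capacity gives one constraint of the form (3).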

The assumption in (2) that commodities are valued equally is not essential 
to the method we propose, as will be clear from our discussion in the following 
section. Another thing we wish to point out is that it is immaterial whether the 
problem involves directed or undirected arcs. Thus, for example, if there are 
“one-way streets,” or if, in a communication network, say, it is desired to 
place an upper bound on the number of messages that can be transmitted from 
$P_i$ to $P_j$, and an upper bound on the messages that can be sent from $P_j$ 
to $P_i$, one considers two arcs, one from $P_i$ to $P_j$, the other from $P_j$ 
to $P_i$, and directed chains from sources to sinks. 

Commodity 1    Commodity 2 

Fig. 2 

Since the number of chains is usually very large in practical applications, 
the arc-chain formulation of the problem might seem to be impossible to deal 
with computationally. Indeed, the enumeration of all chains from commodity 
sources to sinks in a network of moderate size would be a lengthy task, to say 
the least. Fortunately, there is no need to write down the entire matrix $A$, 
since the selection of a variable entering the basic set at any stage of the 
simplex computation (or the recognition that a basis is optimal) can be 
accomplished without explicit knowledge of the non-basic column vectors of $A$. 
All we need is the basis $B = (b_{rj})$ (or its inverse), a square submatrix 
whose order is the number $m$ of arcs in the network, to compute the simplex 
multipliers $\alpha_r$ ($r = 1, \ldots, m$) satisfying, for 
$j = s_1, \ldots, s_m$, 

(4)  $\sum_{r=1}^{m} \alpha_r b_{rj} = \begin{cases} 1 & \text{if } j \le n, \\ 0 & \text{if } j > n. \end{cases}$ 

We can then find a vector to bring into the basis (or prove that the current 
basis is optimal) by the method of the next section. Once such a vector has been 
found, determination of the vector leaving the basis is accomplished in the 
usual way. 
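
Computing the multipliers in (4) amounts to solving one small linear system in the basis columns. The following sketch does this with exact rational arithmetic and plain Gauss-Jordan elimination; it is an illustration of the definition, not the computational scheme a production simplex code would use, and the function name and the two-arc example are assumptions.

```python
from fractions import Fraction


def simplex_multipliers(B, c):
    """Solve sum_r alpha_r * B[r][j] = c[j] for every basis column j.

    B: m x m basis matrix, rows indexed by arcs, columns by basic variables.
    c: objective coefficients of the basic variables (1 for a chain, 0 for a slack).
    """
    m = len(B)
    # Equation j reads: sum over r of alpha_r * B[r][j] = c[j].
    aug = [[Fraction(B[r][j]) for r in range(m)] + [Fraction(c[j])]
           for j in range(m)]
    for col in range(m):
        pivot = next(row for row in range(col, m) if aug[row][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for row in range(m):
            if row != col and aug[row][col] != 0:
                f = aug[row][col]
                aug[row] = [v - f * w for v, w in zip(aug[row], aug[col])]
    return [aug[r][m] for r in range(m)]


# Two arcs; the basis holds a chain using both arcs and the slack of arc 2.
alpha = simplex_multipliers([[1, 0], [1, 1]], [1, 0])
```

For the all-slack starting basis the same call returns zero multipliers, so every arc has length zero at the first iteration.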


3. A Shortest Chain Algorithm 

Suppose we have computed the $\alpha_r$ in (4) corresponding to a particular 
basis $B$. If some $\alpha_r$ is negative, then the variable $x_{n+r}$ may be 
introduced into the basic set with possibly an increase in the form (2), that 
is, the unit vector having 1 in the $r$-th position, zeros elsewhere, can be 
brought into the basis. (It may be that this vector also represents a one-arc 
chain for some commodity; in this case, a bigger increase in (2) might result, 
of course, by taking the latter interpretation.) 

Assume, therefore, that a stage has been reached in the computation where 
all $\alpha_r$ are non-negative. In this case, the algorithm described below, 
which makes no use of the full incidence matrix $A$, can be used either to 
locate a column vector of $A$ (i.e. a commodity chain in the network) that may 
be brought into the basis, or to prove that the current basis is optimal. 

Let us interpret the $\alpha_r$ as lengths of the arcs. We wish to find a 
chain $C_s$, if one exists, whose length 

$\sum_{r=1}^{m} \alpha_r a_{rs}$ 

is less than one, the coefficient of $x_s$ in (2). Thus, it suffices to locate, 
for each commodity, a shortest chain from the commodity sources to its sinks. If 
each of the chains thus selected has length at least one, the basis is optimal. 
Otherwise, a column vector of $A$ corresponding to one of these chains may be 
introduced into the basis. 

The problem of locating a shortest chain from one set of nodes to another set 
of nodes in a network can be reduced to a standard transshipment problem [6], 
and may consequently be solved in various simple ways; see [2] and [6], for 
example. The algorithm we describe is that of [2]. (In [2], the problem is 
considered to be that of finding a shortest chain from one node to another; to 
reduce our problem to this one, simply join each node of the first set to a new 
node by an arc of length zero, and similarly for the other set. We shall give a 
description which does not involve this device explicitly, however.) 

Let the set of sources for one commodity be $S$, the sinks for the commodity 
$T$, and suppose the nodes of the network are $P_1, \ldots, P_N$. Let $l_{ij}$ 
denote the length of the arc joining $P_i$ and $P_j$, i.e. if the arc $A_r$ 
joining $P_i$ and $P_j$ corresponds to the simplex multiplier $\alpha_r$, set 
$l_{ij} = \alpha_r$. (If arcs are directed, then we let $l_{ij}$ denote the 
multiplier corresponding to the arc from $P_i$ to $P_j$, hence in this case we 
may have $l_{ij} \ne l_{ji}$, whereas in the undirected case, 
$l_{ij} = l_{ji}$.) Initially assign to each node $P_i$ a number $\pi_i$ as 
follows: 

$\pi_i = \begin{cases} 0 & \text{for } P_i \in S, \\ \infty & \text{otherwise.} \end{cases}$ 

Now scan the network for an arc $P_iP_j$ such that 

$\pi_i + l_{ij} < \pi_j$. 

Replace $\pi_j$ by $\pi_i + l_{ij}$ if such an arc is found. Continue this 
process. Eventually no such arcs can be found; then the number $\pi_i$ 
represents the length of a shortest chain from $S$ to $P_i$, for all $i$. In 
particular, the smallest $\pi_i$, for $P_i \in T$, is the length of a shortest 
chain from $S$ to $T$. Let $\pi_k$ be the smallest such. To find a chain from 
$S$ to $T$ of length $\pi_k$, look for an arc $P_jP_k$ such that 
$\pi_j + l_{jk} = \pi_k$, then search for an arc $P_iP_j$ such that 
$\pi_i + l_{ij} = \pi_j$, and so on. Eventually a node of $S$ is reached, and 
the desired chain has been traced out (in reverse). 
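
The labeling process just described can be sketched as follows. This is a minimal illustration assuming directed arcs given as a dictionary of non-negative lengths (an undirected arc is entered in both directions). One deliberate departure from the text: instead of searching afterwards for arcs with $\pi_i + l_{ij} = \pi_j$, the sketch records the predecessor each time a label is lowered, which yields the same kind of chain and avoids exact comparisons. The function name and the sample lengths are hypothetical.

```python
import math


def shortest_chain(lengths, sources, sinks):
    """Return (length, node list) of a shortest chain from `sources` to `sinks`.

    lengths: dict {(i, j): l_ij} of non-negative arc lengths.
    """
    nodes = set(sources) | set(sinks)
    for i, j in lengths:
        nodes.update((i, j))
    pi = {v: (0 if v in sources else math.inf) for v in nodes}
    pred = {}
    improved = True
    while improved:                       # scan until no label can be lowered
        improved = False
        for (i, j), l in lengths.items():
            if pi[i] + l < pi[j]:
                pi[j] = pi[i] + l
                pred[j] = i               # remember which arc gave the label
                improved = True
    k = min(sinks, key=lambda v: pi[v])   # sink with the smallest label
    if pi[k] == math.inf:
        return math.inf, []               # no chain joins the two sets
    chain = [k]
    while chain[-1] in pred:              # trace back to a node of S
        chain.append(pred[chain[-1]])
    chain.reverse()
    return pi[k], chain


lengths = {(1, 2): 2, (2, 3): 3, (1, 3): 7, (3, 4): 1, (2, 4): 6}
best_length, best_chain = shortest_chain(lengths, sources={1}, sinks={4})
```

In the simplex setting the lengths are the multipliers $\alpha_r$, and a chain is a candidate to enter the basis when the returned length is less than one.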

If in the process of locating shortest chains from commodity sources to sinks, 
for the various commodities, one is found of length less than one, we recommend 
that the corresponding column vector of $A$ be introduced into the basis 
immediately, rather than repeating the shortest chain algorithm a number of 
times in order to use the usual criterion for selection of a vector to enter the 
basis. 

We point out that the reason for getting rid of negative multipliers 
$\alpha_r$ before using the shortest chain algorithm is that the algorithm may 
not work if arcs have negative lengths. 

To start the simplex computation, one can of course begin with the basic 
variables $x_{n+1}, \ldots, x_{n+m}$, corresponding to the zero flow. 

4. Concluding Remarks 

Except for hand computation of a few small problems, we have no computational 
experience with the proposed method. Whether the method is practicable for a 
problem involving, say, 50 nodes, 100 arcs, and 20 commodity source-sink sets 
$S_1, T_1, \ldots, S_{20}, T_{20}$, is a question which can be settled only by 
experimentation. It would certainly be more practicable in this case than 
straightforward application of the simplex method to a node-arc formulation of 
the problem, since in the latter formulation there would be roughly 1100 
equations, and hence the basis matrices would be much too large, whereas in the 
suggested method, the basis matrices would be $100 \times 100$, and at most 20 
applications of the shortest chain algorithm would be necessary on each simplex 
iteration. How many simplex iterations might be required is another matter, 
though. The incidence matrix $A$ for such a problem could have many thousands of 
columns. On the other hand, there would probably be many column vectors of $A$ 
dominated by others, in the sense that, for a given commodity (or for different 
commodities in the equal value case), if one chain $C$ is a subset of another 
chain, the latter may be ignored. (For instance, in Fig. 2 the chains $C_8$ and 
$C_9$ are dominated by another chain.) The shortest chain method takes care of 
such dominances automatically. 

A more serious consideration is how to handle the case of limited supplies of
commodities in such a problem. For example, suppose that in the two-commodity
maximal flow problem corresponding to the matrix of Fig. 2, there is an
amount $a_1$ of commodity 1 at $P_1$, an amount $a_2$ of commodity 1 at $P_2$, and
an amount $a_3$ of commodity 2 at a third node. We can reduce this to a problem
of the same type by adding three new arcs and nodes as follows: for each supply
node $P$ add a new node $P'$ and an arc from $P'$ to $P$ whose capacity is the
amount available at $P$ [...]. We then take the new nodes $P'_1$, $P'_2$ as sources
for commodity 1 and the third new node as source for commodity 2. However, in the
hypothesized large network with 20 commodities, the number of such new arcs would be
$\sum_i n_i$, where $n_i$ is the number of nodes in $S_i$, and since each new arc increases
the size of basis matrices by one, this might take the problem out of range of
present computing machines.

A NETWORK FLOW COMPUTATION FOR
PROJECT COST CURVES*

D. R. FULKERSON 

Mathematics Division , The RAND Corporation 

A network flow method is outlined for solving the linear programming problem
of computing the least cost curve for a project composed of many individual
jobs, where it is assumed that certain jobs must be finished before others
can be started. Each job has an associated crash completion time and normal 
completion time, and the cost of doing the job varies linearly between these 
extreme times. Given that the entire project must be completed in a prescribed 
time interval, it is desired to find job times that minimize the total project 
cost. The method solves this problem for all feasible time intervals. 

Introduction 

A linear programming problem of some practical importance that has been 
formulated by Kelley and Walker [6, 7] involves computing the cost curve for a 
“project” composed of many individual “jobs” or “activities.” Here a project is 
a partially ordered set of jobs, the partial ordering arising from technological 
restrictions that force certain jobs to be finished before others can be started. It 
is assumed that each job has an associated normal completion time and a crash 
completion time, and that the cost of doing the job varies linearly between these 
two extreme times. Then it would be desirable to calculate the least project cost, 
given that the entire project must be completed in a prescribed time interval. 
This would yield one point on the project cost curve. Solving the problem for all 
feasible time intervals produces the complete project cost curve. With this
information at hand, the project planner can answer either the question posed
above, or the related question: given a fixed budget, what is the earliest project 
completion date? 

We shall show that the project cost curve can be easily computed using
network flow theory¹ [1, 2, 3, 4, 5].

* Received June 1960.

¹ After the results of this paper were obtained, the author learned that a network flow
approach to the project cost problem has also been developed by Kelley. See [7], which
contains a statement to this effect, and [8] for a complete exposition of Kelley’s method.

1. The Project Network

There are at least two rather different ways of depicting the project as a
directed network or linear graph. For example, suppose the project consists of
jobs 1, 2, 3, 4, 5 and that the only order relations are:

1 precedes 3, 4,

2 precedes 4,

3, 4 precede 5,

and those implied by transitivity. The usual way of picturing this partially
ordered set is shown in Fig. 1, where nodes correspond to jobs and directed arcs
to the displayed order relations. Another way is shown in Fig. 2, where some of
the arcs represent jobs, and the nodes may be thought of as events in time;
the existence of a node stipulates that all inward pointing jobs at the node must
be completed before any outward pointing job can be started. Notice that the
second of these two representations of the project uses an arc (the dotted one
of Fig. 2) not corresponding to any job. This need cause no concern, since a
dummy job can be added to the project to correspond to such an arc, and the
assumption made that such fictitious jobs have zero completion time and zero
cost. It is not difficult to see that allowing dummy jobs permits such a network
representation for any project. Indeed, one could merely take the kind of network
shown in Fig. 1, replace each node $i$ by a pair of nodes $i'$, $i''$ and add the directed
arcs $(i', i'')$ (from $i'$ to $i''$) to the network. An old arc $(i, j)$ becomes the arc
$(i'', j')$ in the new network; these latter are now dummies. But this is not, in
general, efficient in terms of the number of nodes and arcs.
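
The node-splitting construction can be sketched in a few lines of Python. This is an
illustrative sketch only: the function name `split_job_nodes` and the
`(job, "start")` / `(job, "finish")` naming of the paired nodes are assumptions made
here, standing in for the primed nodes of the text.

```python
def split_job_nodes(jobs, precedences):
    """Turn a jobs-on-nodes project into a jobs-on-arcs network.

    jobs: iterable of job names; precedences: iterable of (i, j), "i precedes j".
    Returns (job_arcs, dummy_arcs); every arc is a (tail, head) pair of event nodes.
    """
    # Each job i becomes an arc between its two new event nodes.
    job_arcs = {i: ((i, "start"), (i, "finish")) for i in jobs}
    # Each old order relation becomes a dummy arc of zero time and zero cost.
    dummy_arcs = [((i, "finish"), (j, "start")) for i, j in precedences]
    return job_arcs, dummy_arcs

# The five-job example of the text.
job_arcs, dummy_arcs = split_job_nodes(
    [1, 2, 3, 4, 5], [(1, 3), (1, 4), (2, 4), (3, 5), (4, 5)])
events = {node for arc in list(job_arcs.values()) + dummy_arcs for node in arc}
```

For the five-job example this yields 10 event nodes, 5 job arcs, and 5 dummy arcs,
which illustrates why the construction is usually not economical in nodes and arcs.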

Using either of these two network representations of the project, the problem 
of computing the cost curve can be shown to be a flow problem. We shall work 
with the second representation. Thus we take as given a directed network in 
which arcs correspond to jobs and nodes to events. This network contains no 
directed cycles. We may also assume, by adding “beginning” and “terminal” 
nodes (events), if necessary, together with appropriate arcs pointing out from 
the beginning node and into the terminal node, that each arc is contained in some 
directed chain from the beginning node to the terminal node. Finally, we may
assume, since the network contains no directed cycles, that the nodes have been
numbered $1, 2, \ldots, n$ in such a way that 1 is the beginning node, $n$ the terminal
node, and if $(i, j)$ is an arc, then $i < j$.
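
One standard way to obtain such a numbering is a topological sort. The sketch below
uses Kahn's method, which is an assumption of this illustration rather than anything
prescribed by the paper; the function name `number_nodes` is likewise hypothetical.
When the beginning node is the only node without incoming arcs and the terminal node
the only one without outgoing arcs, they automatically receive the numbers 1 and n.

```python
from collections import deque

def number_nodes(nodes, arcs):
    """Return {node: number} with number[u] < number[w] for every arc (u, w)."""
    indegree = {u: 0 for u in nodes}
    out = {u: [] for u in nodes}
    for u, w in arcs:
        out[u].append(w)
        indegree[w] += 1
    ready = deque(u for u in nodes if indegree[u] == 0)
    number = {}
    while ready:
        u = ready.popleft()
        number[u] = len(number) + 1      # next unused integer
        for w in out[u]:
            indegree[w] -= 1
            if indegree[w] == 0:         # all predecessors of w are numbered
                ready.append(w)
    if len(number) != len(nodes):
        raise ValueError("network contains a directed cycle")
    return number

arcs = [("s", "a"), ("s", "b"), ("a", "t"), ("b", "t"), ("a", "b")]
number = number_nodes(["t", "a", "s", "b"], arcs)
```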

2. The Project Cost Curve Program and Its Dual 

Associated with each arc $(i, j)$ of the project network are three nonnegative
integers: $a(i, j)$, $b(i, j)$, $c(i, j)$, with

(2.1)    $a(i, j) \le b(i, j)$.

Here the interpretation is that $a(i, j)$ is the crash time for job $(i, j)$, $b(i, j)$ the
normal completion time, while $c(i, j)$ is the decrease in cost of doing job $(i, j)$
per unit increase in time from $a(i, j)$ to $b(i, j)$. In other words, the cost of doing
$(i, j)$ in $\tau(i, j)$ units of time is given by the known linear function

(2.2)    $k(i, j) - c(i, j)\,\tau(i, j)$

over the interval

(2.3)    $a(i, j) \le \tau(i, j) \le b(i, j)$.

Then the problem is, given $\lambda$ units of time in which to finish the project, to
choose, for each job $(i, j)$, a time $\tau(i, j)$ satisfying (2.3) in such a way that the
resulting project cost

(2.4)    $\sum_{i,j} [k(i, j) - c(i, j)\,\tau(i, j)]$

is minimized, or equivalently, the function

(2.5)    $\sum_{i,j} c(i, j)\,\tau(i, j)$

is maximized. Thus, letting $\tau(i)$ be the (unknown) time of occurrence of event
$i$, we wish to maximize (2.5) subject to the inequalities

(2.6)    $\tau(i, j) + \tau(i) - \tau(j) \le 0$,    all $(i, j)$,

(2.7)    $-\tau(1) + \tau(n) \le \lambda$,

(2.8)    $\tau(i, j) \le b(i, j)$,    all $(i, j)$,

(2.9)    $-\tau(i, j) \le -a(i, j)$,    all $(i, j)$.

Here (2.6) expresses the condition that there must be sufficient time between
the occurrences of events $i$ and $j$ to do job $(i, j)$ in $\tau(i, j)$ units of time, and
(2.7) says that the project duration time is at most $\lambda$.

The project cost $P(\lambda)$ corresponding to the assigned value of $\lambda$ in (2.7) is
given by

(2.10)    $P(\lambda) = \sum_{i,j} k(i, j) - \max \sum_{i,j} c(i, j)\,\tau(i, j)$,

the maximum being taken over all $\tau(i, j)$, $\tau(i)$ that satisfy the constraints.
Here we assume that the constraints are feasible, which will be the case for
sufficiently large $\lambda$. Indeed, for given $\tau(i, j)$ satisfying (2.8) and (2.9), the
constraints are feasible if and only if $\lambda$ is at least equal to the “$\tau$-length” of a longest
directed chain from 1 to $n$, that is, $\lambda$ must be at least equal to the maximum of
$\sum \tau(i, j)$, the summation being along a directed chain from 1 to $n$, and the
maximum being taken over all such chains. The proof of this relies on the fact
that the project network contains no directed cycles.
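
The feasibility test can be illustrated with a short Python sketch. It assumes, as in
Section 1, that nodes are numbered 1 to n with every arc going from a smaller to a
larger number and that every node other than 1 has an incoming arc; the function
names and the sample job times are hypothetical.

```python
def longest_chain_length(n, job_time):
    """job_time maps (i, j) -> tau(i, j), with i < j for every arc.

    Returns the largest total job time along a directed chain from 1 to n.
    """
    earliest = {1: 0}
    for j in range(2, n + 1):
        # Longest chain into j extends a longest chain into some predecessor i.
        earliest[j] = max(earliest[i] + t
                          for (i, head), t in job_time.items() if head == j)
    return earliest[n]

def is_feasible(n, job_time, lam):
    """True when the project can finish within lam units of time."""
    return lam >= longest_chain_length(n, job_time)

times = {(1, 2): 3, (1, 3): 2, (2, 3): 0, (2, 4): 4, (3, 4): 5}
```

With these sample times the longest chain is 1, 2, 3, 4 with length 8 (the arc
(2, 3) plays the role of a dummy job), so a duration of 8 is feasible and 7 is not.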

Dummy jobs may be assumed to have lower bounds $a(i, j) = 0$, upper bounds
$b(i, j) = 0$, and costs $c(i, j) = 0$ in this linear program. To construct the project
cost curve $P(\lambda)$, we need to solve the program (2.5)-(2.9) parametrically in $\lambda$.
This formulation of the project cost curve program has been given by Kelley
and Walker [6, 7].

It may be observed preliminarily that $P(\lambda)$, which is well defined for some
$\lambda$-interval, is convex. For if $\lambda_1$, $\lambda_2$ are two given values of $\lambda$ that make the
constraints feasible, and if $\tau_1(i, j)$, $\tau_1(i)$, $\tau_2(i, j)$, $\tau_2(i)$ represent optimal solutions
to the two corresponding programs, then averaging these two solutions gives a
feasible solution to the constraints corresponding to the $\lambda$-value $\frac{1}{2}(\lambda_1 + \lambda_2)$.
Hence, since we are minimizing $P(\lambda)$,

         $P[\frac{1}{2}(\lambda_1 + \lambda_2)] \le \frac{1}{2}P(\lambda_1) + \frac{1}{2}P(\lambda_2)$.

In addition, $P(\lambda)$ is piecewise linear, as will be apparent later on.

We may set $\tau(1) = 0$, since adding a constant to all event times does not
alter the program. With this normalization, it follows from (2.6) that all $\tau(i)$
are nonnegative, since the job times are nonnegative by (2.9), and since each
node is contained in some directed chain from 1 to $n$.

Let us examine the dual of the project cost program. If we assign nonnegative
multipliers $f(i, j)$, $v$, $g(i, j)$, $h(i, j)$ to the constraints (2.6), (2.7), (2.8), (2.9)
respectively, the dual of the program, for fixed $\lambda$ and $\tau(1) = 0$, has constraints

(2.11)    $f(i, j) + g(i, j) - h(i, j) = c(i, j)$,    all $(i, j)$,

(2.12)    $\sum_j [f(i, j) - f(j, i)] = 0$ for $i \ne 1, n$, and $= -v$ for $i = n$,

subject to which

(2.13)    $\lambda v + \sum_{i,j} b(i, j)\,g(i, j) - \sum_{i,j} a(i, j)\,h(i, j)$

is to be minimized. Here, we repeat, all variables are nonnegative. Equalities
appear in the constraints since variables of the primal program are not explicitly
restricted in sign.

It follows immediately that at least one of $g(i, j)$, $h(i, j)$ can be taken zero in
an optimal dual solution, and hence we may assume

(2.14)    $g(i, j) = \max[0,\ c(i, j) - f(i, j)]$,

(2.15)    $h(i, j) = \max[0,\ f(i, j) - c(i, j)]$.

Thus the dual problem becomes: find nonnegative numbers $f(i, j)$, one for each
arc of the project network, and a nonnegative number $v$, that satisfy the equations
(2.12) and minimize the nonlinear function

(2.16)    $\lambda v + \sum_{i,j} b(i, j) \max[0,\ c(i, j) - f(i, j)]
                 - \sum_{i,j} a(i, j) \max[0,\ f(i, j) - c(i, j)]$.

i,3 

A key observation at this point is that a function of the form

(2.17)    $b \max(0,\ c - f) - a \max(0,\ f - c)$

(sketched in Fig. 3) is convex, and of course, piecewise linear. The convexity of
(2.17) follows from the assumption $a \le b$. Thus, even though (2.16) is nonlinear,
it is the next best thing (for minimizing), namely a sum of piecewise linear,
convex functions of the individual variables $f(i, j)$. As is well known in linear
programming, such a function can be dealt with by linear methods.

Here one replaces each $f(i, j)$ by a sum of two nonnegative variables, say,

(2.18)    $f(i, j) = f(i, j; 1) + f(i, j; 2)$,

the new variables being subject to the upper bound or capacity constraints

(2.19)    $f(i, j; 1) \le c(i, j)$,

(2.20)    $f(i, j; 2) \le \infty$.

Then $f(i, j; 1)$ has coefficient $-b(i, j)$, $f(i, j; 2)$ has coefficient $-a(i, j)$ in the
new linear minimizing form. Thus, if we define

(2.21)    $c(i, j; k) = c(i, j)$ for $k = 1$, and $= \infty$ for $k = 2$,

(2.22)    $a(i, j; k) = b(i, j)$ for $k = 1$, and $= a(i, j)$ for $k = 2$,

the dual program has constraints

(2.23)    $\sum_{j,k} [f(i, j; k) - f(j, i; k)] = 0$ for $i \ne 1, n$, and $= -v$ for $i = n$,

(2.24)    $0 \le f(i, j; k) \le c(i, j; k)$,

and minimizing form

(2.25)    $\lambda v - \sum_{i,j,k} a(i, j; k)\,f(i, j; k)$.

This program has the following network flow interpretation. First enlarge the
project network by doubling the number of arcs: corresponding to each arc
$(i, j)$ of the project network there are now two arcs $(i, j; 1)$ and $(i, j; 2)$ from
$i$ to $j$ (see Fig. 4). Each arc $(i, j; k)$ of the new network has an assigned flow
capacity $c(i, j; k)$. The problem is to construct a flow $f(i, j; k)$ from source node
1 to sink node $n$ in the new network that minimizes (2.25).²
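
A small numerical check may make the linearization concrete. The Python sketch below
evaluates the single-arc function (2.17), confirms that its slopes are nondecreasing
when a is at most b, and verifies that splitting the flow into a capacitated part and
an overflow part as in (2.18)-(2.20) reproduces it up to the constant bc. The data
a = 1, b = 3, c = 2 and the helper names are illustrative assumptions.

```python
def g(f, a, b, c):
    """The function (2.17): b*max(0, c - f) - a*max(0, f - c)."""
    return b * max(0, c - f) - a * max(0, f - c)

def split(f, c):
    """Split f into f1 <= c (arc k = 1) and the overflow f2 >= 0 (arc k = 2)."""
    f1 = min(f, c)
    return f1, f - f1

a, b, c = 1, 3, 2                       # illustrative data with a <= b
values = [g(f, a, b, c) for f in range(6)]
slopes = [y1 - y0 for y0, y1 in zip(values, values[1:])]

# Linear form in the split variables: constant b*c, coefficients -b and -a.
linear = [b * c - b * f1 - a * f2 for f1, f2 in (split(f, c) for f in range(6))]
```

Because the coefficient $-b$ is no larger than $-a$, a minimizing solution fills the
capacitated arc before using the uncapacitated one, which is why the split variables
can be treated as independent in the linear program.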

Flow problems of this kind have received a great deal of study in recent years,
and computational methods far superior to general linear programming methods
are known for such problems [2, 3, 4, 5]. In the next section we shall describe an
efficient flow algorithm for generating the complete project cost curve. This
algorithm will start with the largest $\lambda$ of interest, namely $\lambda$ equal to the maximal
$b$-length of directed chains from 1 to $n$. The algorithm then determines
sequentially a finite set of values of $\lambda$ that contains all breakpoints of the convex,
piecewise linear $P(\lambda)$. Corresponding to each of these $\lambda$-values, certain node
numbers $\tau(i)$ are produced. We denote them by $\tau(i)$ because they do indeed
have the interpretation of optimal event times in the original project program.
Here we shall have $\tau(1) = 0$, $\tau(n) = \lambda$. Then optimal job times $\tau(i, j)$ for the
project, corresponding to these $\tau(i)$, are given simply by defining

(2.26)    $\tau(i, j) = \min[b(i, j),\ \tau(j) - \tau(i)]$.

We shall discuss these assertions in more detail following the algorithm
statement.
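
Rule (2.26) is easy to state in code. The following Python sketch assumes event times
are given as a dictionary and jobs as a dictionary of (a, b, c) triples; these formats
and the function names are hypothetical. Because the fixed terms k(i, j) only shift
the curve, the sketch reports the cost in excess of the all-normal schedule, that is,
the sum of c(i, j) times the amount by which each job is shortened below b(i, j).

```python
def job_times(event_time, jobs):
    """Apply (2.26): each job takes min(normal time, time available)."""
    return {(i, j): min(b, event_time[j] - event_time[i])
            for (i, j), (a, b, c) in jobs.items()}

def extra_cost(times, jobs):
    """Cost above the all-normal schedule: sum of c * (b - job time)."""
    return sum(c * (b - times[i, j]) for (i, j), (a, b, c) in jobs.items())

# Two jobs in series: (crash time, normal time, unit cost of shortening).
jobs = {(1, 2): (1, 3, 2), (2, 3): (2, 4, 1)}
event_time = {1: 0, 2: 3, 3: 5}          # project duration 5 instead of 7
times = job_times(event_time, jobs)
```

Here job (1, 2) keeps its normal time 3 while job (2, 3) is shortened from 4 to 2,
at an extra cost of 2.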


3. A Flow Algorithm for Determining $P(\lambda)$

The basic routine used in the algorithm is a labeling process in which labels are 
assigned to some of the nodes. In general, the labeling process is a systematic 
search for a path (having certain desired properties) from 1 to n. Here the word 
“path,” as opposed to “directed chain” or “chain,” means that arcs can be 
traversed against their orientations in going from 1 to n. Such arcs are termed 
reverse arcs of the path; the others, traversed with their orientation, are called 
forward arcs of the path. 

We enter the labeling process with an integral flow $f = f(i, j; k)$ and node
integers $\tau(i)$ that satisfy

(3.1)    $\tau(1) = 0$,

(3.2)    $a(i, j; k) + \tau(i) - \tau(j) < 0 \Rightarrow f(i, j; k) = 0$,

(3.3)    $a(i, j; k) + \tau(i) - \tau(j) > 0 \Rightarrow f(i, j; k) = c(i, j; k)$.

To shorten the notation, we shall set

(3.4)    $\bar{a}(i, j; k) = a(i, j; k) + \tau(i) - \tau(j)$.

(The properties (3.1), (3.2), (3.3) are optimality properties for the program
(2.23), (2.24), (2.25) corresponding to $\lambda = \tau(n)$. That is, for $\lambda = \tau(n)$, these
properties imply that the flow $f$ minimizes (2.25).³) The labeling process
terminates in one of two ways, called “breakthrough” and “nonbreakthrough.”
Breakthrough means that the node $n$ has received a label, and this in turn means
that a path from 1 to $n$ has been found having the properties that for all forward
arcs of the path, $\bar{a}(i, j; k) = 0$ and $f(i, j; k) < c(i, j; k)$, whereas for all reverse
arcs of the path, $\bar{a}(i, j; k) = 0$ and $f(i, j; k) > 0$. Then the old flow $f$ is changed
by adding a positive integer $\epsilon$ to the amount of flow in all forward arcs of the
path, and subtracting $\epsilon$ from the flow in reverse arcs of the path. This yields a
new integral flow $f'$ from 1 to $n$, for which the optimality properties (3.2), (3.3)
hold. Nonbreakthrough, on the other hand, means that the labeling process has
terminated and node $n$ has not been labeled. In this case, the node integers are
changed by subtracting a positive integer $\delta$ from all $\tau(i)$ corresponding to
unlabeled $i$. This doesn’t change $\tau(1) = 0$ but reduces $\tau(n) = \lambda$ and thus defines a
new set of optimal job times, via (2.26), corresponding to this new value of $\lambda$.
The node number change $\delta$ is selected in such a way that the new node numbers
$\tau'(i)$ and old flow $f$ still satisfy the optimality properties (3.2), (3.3).

After either a breakthrough or a nonbreakthrough, the labeling process is
repeated.

The project cost $P(\lambda)$ is linear between successive values of $\lambda$ produced by
nonbreakthroughs, as we shall see.

² A function $f$ from arcs to nonnegative reals that satisfies equations (2.23) for some
number $v$, and also satisfies the capacity constraints (2.24) on individual arcs, is called a flow
from 1 to $n$ of value $v$. The left-hand side of (2.23) is the net flow out of node $i$. Thus the
net flow out of nodes $2, \ldots, n - 1$ is zero, the net flow into $n$ is $v$, and it follows that the
net flow out of 1 is $v$.

The labeling process described below has been divided into two parts, called
first and second labelings, respectively. The first labeling seeks a directed chain
from 1 to $n$ composed of infinite capacity arcs (those corresponding to $k = 2$)
such that $\bar{a}(i, j; 2) = 0$ for each arc of this chain. If such a chain is found, then
the computation terminates, for the existence of such a chain means, in effect,
that any further decrease in $\lambda$ would make the project program infeasible. If no
such chain can be found, we go on to the second labeling, in which the search is
extended in an attempt to find a path from 1 to $n$ having the properties
previously outlined.

³ This follows from the linear programming duality theorem, or can be seen directly by
observing that

    $\sum_{i,j,k} \bar{a}(i, j; k)\,f(i, j; k)
        = \sum_{i,j,k} a(i, j; k)\,f(i, j; k) + \sum_{i,j,k} [\tau(i) - \tau(j)]\,f(i, j; k)$
    $   = \sum_{i,j,k} a(i, j; k)\,f(i, j; k) + [\tau(1) - \tau(n)]\,v$
    $   = \sum_{i,j,k} a(i, j; k)\,f(i, j; k) - \lambda v$.

Then (3.2) and (3.3) imply that $f$ maximizes the left-hand side of this equality, hence, the
right.

The computation is initiated by the Start routine below, which generates a
starting set of node numbers that, in conjunction with the zero flow, satisfy the
optimality properties (3.1), (3.2), (3.3). This routine is nothing more than a
way of finding a chain of maximal $b$-length from 1 to $n$, and has been used by
Kelley and Walker to compute what they term “critical paths” [6]. (Their usage
of the word “path” corresponds to our usage of the word “chain” or “directed
chain.”)

We now describe the algorithm in detail. 

Start. Successively compute $\tau(1), \tau(2), \ldots, \tau(n)$ by the recursion

(3.5)    $\tau(1) = 0$,
         $\tau(j) = \max_i\,[\tau(i) + a(i, j; 1)] = \max_i\,[\tau(i) + b(i, j)]$.

Set all $f(i, j; k) = 0$.

Iterative Procedure. Enter with an integral flow $f = f(i, j; k)$ and node integers
$\tau(i)$ satisfying (3.1), (3.2), (3.3). Initially these are the ones generated by the
Start routine.

A. Labeling Process. During this routine, a node is considered to be in one of
three states: unlabeled, labeled and unscanned, or labeled and scanned. Initially
all nodes are unlabeled.

1. First Labeling. Assign node 1 the label $[-, -, -, \epsilon(1) = \infty]$. (This node is
now labeled and unscanned; all other nodes are unlabeled.) In general, select
any labeled, unscanned node, say node $i$, and search for all unlabeled nodes $j$
such that $(i, j; 2)$ is an arc with

(3.6)    $\bar{a}(i, j; 2) = 0$.

Label such nodes $j$ with $[i, 2, +, \epsilon(j) = \infty]$. (Such nodes $j$ are now labeled and
unscanned and node $i$ is labeled and scanned.) Repeat the general step until
either the node $n$ is labeled and unscanned, or no more nodes can be labeled and
node $n$ is unlabeled. In the former case, terminate. In the latter case, go on to
the Second Labeling.

2. Second Labeling. Nodes labeled above retain their labels, and the labeling
process continues as follows. All nodes revert to the unscanned state. The general
step: Select any labeled, unscanned node, say node $i$, and scan it for all unlabeled
nodes $j$ such that either

(3.7)    $(i, j; k)$ is an arc with $\bar{a}(i, j; k) = 0$, $f(i, j; k) < c(i, j; k)$

or

(3.8)    $(j, i; k)$ is an arc with $\bar{a}(j, i; k) = 0$, $f(j, i; k) > 0$.

If (3.7), assign $j$ the label $[i, k, +, \epsilon(j)]$, where

(3.9)    $\epsilon(j) = \min[\epsilon(i),\ c(i, j; k) - f(i, j; k)]$.

If (3.8), assign $j$ the label $[i, k, -, \epsilon(j)]$, where

(3.10)    $\epsilon(j) = \min[\epsilon(i),\ f(j, i; k)]$.

(Such nodes $j$ are now labeled and unscanned and node $i$ is labeled and scanned.)
Repeat the general step until either node $n$ is labeled and unscanned
(breakthrough), or until no more labels can be assigned and node $n$ is unlabeled
(nonbreakthrough). In case of breakthrough, go on to routine B. If nonbreakthrough,
go to routine C.

B. Flow Change. The labeling process has resulted in breakthrough. Change
the flow $f$ as follows. Node $n$ will be labeled $[j, k, +, \epsilon(n)]$. Add $\epsilon(n)$
to $f(j, n; k)$; then go on to node $j$ and its label. The general step: if node $p$ is
labeled $[i, k, +, \epsilon(p)]$, add $\epsilon(n)$ to $f(i, p; k)$; if node $p$ is labeled $[i, k, -, \epsilon(p)]$,
subtract $\epsilon(n)$ from $f(p, i; k)$; in either case, go on to node $i$ and its label. Repeat
the general step until node 1 is reached, that is, $\epsilon(n)$ has been added to some
$f(1, j; k)$. Then discard the labels and go back to A.

C. Node Number Change. The labeling process has resulted in nonbreakthrough.
Single out the following subsets of arcs:

(3.11)    $A_1 = \{(i, j; k) \mid i$ labeled, $j$ unlabeled, $\bar{a}(i, j; k) < 0\}$,

(3.12)    $A_2 = \{(i, j; k) \mid i$ unlabeled, $j$ labeled, $\bar{a}(i, j; k) > 0\}$.

Define

(3.13)    $\delta_1 = \min_{A_1}\,[-\bar{a}(i, j; k)]$,

(3.14)    $\delta_2 = \min_{A_2}\,[\bar{a}(i, j; k)]$,

(3.15)    $\delta = \min(\delta_1, \delta_2)$.

Change the node integers $\tau(i)$ by subtracting $\delta$ from all $\tau(i)$ corresponding to
unlabeled $i$. Discard the labels and go back to A.
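
The Start routine and routines A, B, C can be put together in a compact Python
sketch. It is a sketch under stated assumptions rather than a reference
implementation: the name `project_cost_curve` and the input format (a dictionary
mapping each arc to its integer triple of crash time, normal time, and unit cost) are
hypothetical, the nodes are assumed to be numbered 1 to n as in Section 1 with every
node other than 1 having an incoming arc, and labeling continues until no more nodes
can be labeled instead of stopping as soon as n is reached. The cost reported is the
excess over the all-normal schedule, accumulated from the fact (established in
Section 4) that the cost rises by the current flow value for each unit by which the
duration is reduced.

```python
import math

def project_cost_curve(n, jobs):
    """jobs maps (i, j) -> (a, b, c). Returns [(duration, extra cost), ...]."""
    coef, cap = {}, {}                       # a(i,j;k) and c(i,j;k) of (2.21), (2.22)
    for (i, j), (a, b, c) in jobs.items():
        coef[i, j, 1], cap[i, j, 1] = b, c
        coef[i, j, 2], cap[i, j, 2] = a, math.inf
    flow = dict.fromkeys(coef, 0)
    tau = {1: 0}                             # Start routine, recursion (3.5)
    for j in range(2, n + 1):
        tau[j] = max(tau[i] + b for (i, head), (a, b, c) in jobs.items() if head == j)

    def abar(arc):                           # the quantity (3.4)
        return coef[arc] + tau[arc[0]] - tau[arc[1]]

    def grow(label, second):
        stack = list(label)                  # every labeled node starts unscanned
        while stack:
            u = stack.pop()
            for arc in coef:
                i, j, k = arc
                if abar(arc) != 0:
                    continue
                ok = flow[arc] < cap[arc] if second else k == 2
                if i == u and j not in label and ok:              # (3.6), (3.7)
                    label[j] = (arc, 1, min(label[u][2], cap[arc] - flow[arc]))
                    stack.append(j)
                elif second and j == u and i not in label and flow[arc] > 0:  # (3.8)
                    label[i] = (arc, -1, min(label[u][2], flow[arc]))
                    stack.append(i)

    v, extra, curve = 0, 0, [(tau[n], 0)]
    while True:
        label = {1: (None, 0, math.inf)}
        grow(label, second=False)            # first labeling
        if n in label:
            return curve                     # no shorter duration is feasible
        grow(label, second=True)             # second labeling
        if n in label:                       # routine B: breakthrough
            eps, node = label[n][2], n
            while node != 1:
                arc, sign, _ = label[node]
                flow[arc] += sign * eps
                node = arc[0] if sign > 0 else arc[1]
            v += eps
        else:                                # routine C: nonbreakthrough
            d1 = [-abar(x) for x in coef
                  if x[0] in label and x[1] not in label and abar(x) < 0]
            d2 = [abar(x) for x in coef
                  if x[0] not in label and x[1] in label and abar(x) > 0]
            delta = min(d1 + d2)
            for i in tau:
                if i not in label:
                    tau[i] -= delta
            extra += delta * v
            curve.append((tau[n], extra))

curve = project_cost_curve(3, {(1, 2): (1, 3, 2), (2, 3): (2, 4, 1)})
```

For two jobs in series the sketch first shortens the cheaper job (2, 3) and then job
(1, 2), giving the points (7, 0), (5, 2), (3, 6) of the cost curve.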

4. Discussion of the Computation 

The starting set of node integers $\tau(i)$ and the zero flow $f$ satisfy the optimality
properties (3.1), (3.2), (3.3), since $\bar{a}(i, j; k) \le 0$ for all arcs. Moreover, if we
enter the iterative procedure with node numbers $\tau(i)$ and a flow $f$ that satisfy
these properties, and if breakthrough occurs, the new function $f'$ obtained from
$f$ by adding the positive integer $\epsilon(n)$ to all $f(i, j; k)$ corresponding to forward
arcs of the path from 1 to $n$, and subtracting $\epsilon(n)$ from all $f(i, j; k)$ corresponding
to reverse arcs of the path, is again an integral flow (of value $v + \epsilon(n)$) from
1 to $n$. Moreover, the old node integers $\tau(i)$ and new flow $f'(i, j; k)$ again satisfy
(3.2), (3.3), simply because the flow changes made occur in arcs for which
$\bar{a}(i, j; k) = 0$.

If, on the other hand, nonbreakthrough occurs, then the node number change
$\delta$ of routine C is a positive integer and the resulting node integers $\tau'(i)$ and flow $f$
again satisfy the optimality properties. We first check that $\delta$ is well defined, i.e.
that at least one of the sets of arcs $A_1$, $A_2$ defined by (3.11), (3.12) is nonempty.
Indeed $A_1$ cannot be empty. For suppose $A_1$ were empty. Since there is a chain
from 1 to $n$ in the project network, and since 1 is labeled, $n$ unlabeled, there
must be a pair of arcs $(i, j; k)$, $k = 1, 2$, in the enlarged network with $i$ labeled,
$j$ unlabeled. Then $\bar{a}(i, j; k) \ge 0$ for this pair of arcs. It follows from (3.3) and
labeling rule (3.7) that $f(i, j; k) = c(i, j; k)$, hence $f(i, j; 2) = \infty$. But this is
absurd. Consequently $\delta$ is a positive integer. We show next that for any $\delta'$
satisfying

(4.1)    $0 \le \delta' \le \delta$,

the node numbers

(4.2)    $\tau'(i) = \tau(i)$ for $i$ labeled, and $\tau'(i) = \tau(i) - \delta'$ for $i$ unlabeled,

and the flow $f$ again satisfy the optimality properties. Thus suppose

(4.3)    $\bar{a}'(i, j; k) = a(i, j; k) + \tau'(i) - \tau'(j) < 0$.

We need to verify that $f(i, j; k) = 0$. If $\bar{a}(i, j; k) < 0$, this is immediate. If
$\bar{a}(i, j; k) = 0$, it follows from (4.2) and (4.3) that $i$ is unlabeled, $j$ labeled at
the conclusion of the labeling process. Hence, by labeling rule (3.8), $f(i, j; k) = 0$,
as otherwise $i$ would be labeled from $j$. Finally, suppose $\bar{a}(i, j; k) > 0$. Then,
since $\bar{a}'(i, j; k) < \bar{a}(i, j; k)$, we again have $i$ unlabeled, $j$ labeled. But then the
arc $(i, j; k)$ is in $A_2$ defined by (3.12), and hence $\bar{a}'(i, j; k) \ge 0$, contradicting
the assumption $\bar{a}'(i, j; k) < 0$. Thus this case cannot occur.

A similar proof shows that if $\bar{a}'(i, j; k) > 0$, then $f(i, j; k) = c(i, j; k)$. (Hence,
in particular, we cannot have $\bar{a}'(i, j; 2) > 0$.)

This completes the proof that the outputs of the iterative procedure again 
satisfy the optimality properties if the inputs do. 

That the algorithm terminates after finitely many applications of the labeling
procedure can be seen in various ways. One way is as follows. Suppose that the
algorithm fails to terminate, so that an infinite sequence of breakthroughs and
nonbreakthroughs occurs. The number of breakthroughs in this sequence is finite.
For otherwise, since the flow change following breakthrough is a positive integer,
flows having arbitrarily large values $v$ would be produced. But if the algorithm
produces a flow $f$ having sufficiently large value $v$, there must be at this stage a
chain from 1 to $n$ of arcs corresponding to $k = 2$ such that $f(i, j; 2) > 0$ on arcs
of this chain. Hence, since $\bar{a}(i, j; 2) \le 0$ throughout the computation, we have
$\bar{a}(i, j; 2) = 0$ for arcs of this chain. But then the first labeling results in
termination. This leaves only the possibility that infinitely many successive
nonbreakthroughs occur. This possibility is eliminated by observing that, following
nonbreakthrough, all nodes that were previously labeled can again be labeled, and,
in addition, at least one more node can be labeled. The first part of this statement
follows from the fact that for labeled $i$ and $j$, the new $\bar{a}'(i, j; k)$ are equal to the
old $\bar{a}(i, j; k)$; the second by looking at an arc in $A_1$ or $A_2$ that determines $\delta$.

To sum up, the algorithm produces successive integral flows and node integers
that satisfy the optimality properties (3.1), (3.2), (3.3), and eventually
terminates. It is important to be a little more precise about the first part of this
statement, in the following sense. Suppose that between two occurrences of
nonbreakthrough in the computation, a number (possibly zero) of breakthroughs occur.
Let $\tau_1$ and $\tau_2$ denote the node integers produced following the two
nonbreakthroughs and let $f$ be the last flow produced by the intervening breakthroughs.
Then $f$ minimizes (2.25) for all $\lambda$ between $\tau_1(n)$ and $\tau_2(n) = \tau_1(n) - \delta$. We
shall use this fact later on.

We next verify that (2.26) defines optimal job times corresponding to $\lambda = \tau(n)$.
To this end, one can go back to the original pair of dual programs (2.5)-(2.9)
and (2.11)-(2.13), using also (2.14), (2.15) to define $g$ and $h$, and (2.18) to
define $f$. It suffices to show that

(4.4)    $\tau(i, j) + \tau(i) - \tau(j) < 0 \Rightarrow f(i, j) = 0$,

(4.5)    $\tau(i, j) < b(i, j) \Rightarrow g(i, j) = 0$,

(4.6)    $\tau(i, j) > a(i, j) \Rightarrow h(i, j) = 0$,

since (with $\tau(1) = 0$, $\tau(n) = \lambda$) these are optimality properties for primal and
dual. These implications follow in a straightforward manner from (3.2) and
(3.3). For example, suppose the hypothesis of (4.4) holds. Then $\tau(i, j) = b(i, j)$,
hence $b(i, j) + \tau(i) - \tau(j) < 0$. Consequently $a(i, j) + \tau(i) - \tau(j) < 0$ also.
It then follows from (3.2) that $f(i, j; k) = 0$, $k = 1$ and 2, hence $f(i, j) = 0$,
verifying (4.4). The others may be proved similarly.

Thus each new set of event times $\tau(i)$ yields a new point on the project cost
curve by defining job times $\tau(i, j)$ as in (2.26) and calculating the project cost

(4.7)    $P(\lambda) = P[\tau(n)] = \sum_{i,j} [k(i, j) - c(i, j)\,\tau(i, j)]$.

The project cost $P(\lambda)$ is linear between successive values of $\lambda = \tau(n)$ generated
in the computation. For let $\lambda_1 > \lambda_2$ be two successive $\lambda$’s and suppose

(4.8)    $\lambda_1 \ge \lambda \ge \lambda_2$.

Let $f$ be the flow that produced the node number change yielding $\lambda_2$ from $\lambda_1$
and suppose $f$ has value $v$. We have earlier pointed out that $f$ minimizes (2.25) for
all $\lambda$ satisfying (4.8). Hence for such $\lambda$

(4.9)    $P(\lambda) = K - [\lambda v - \sum_{i,j,k} a(i, j; k)\,f(i, j; k)]$.

Here $K$ is the constant

(4.10)    $K = \sum_{i,j} [k(i, j) - b(i, j)\,c(i, j)]$.

Thus

(4.11)    $P(\lambda) - P(\lambda_1) = (\lambda_1 - \lambda)\,v$,    $\lambda_1 \ge \lambda \ge \lambda_2$,

so that $P(\lambda)$ is linear in the interval (4.8).

The equation (4.11) also shows how to pick out all breakpoints of the convex,
piecewise linear $P(\lambda)$. For suppose $\lambda_1 > \lambda_2 > \lambda_3$ are three successive values of
$\tau(n)$ generated in the computation, and let $v$ be as defined above. Let $v'$ be the
value of the flow that produced the nonbreakthrough yielding $\lambda_3$ from $\lambda_2$. Then

         $P(\lambda_2) - P(\lambda_1) = (\lambda_1 - \lambda_2)\,v$,

         $P(\lambda_3) - P(\lambda_2) = (\lambda_2 - \lambda_3)\,v'$.

Consequently $\lambda_2$ is a breakpoint of $P(\lambda)$ if and only if $v < v'$, that is, if and only
if there is an intervening breakthrough between the two nonbreakthroughs that
yield $\lambda_2$ and $\lambda_3$.

For example, if a problem computation yields the sequence of breakthroughs
and nonbreakthroughs (indicated by B and N)

         B Ⓝ B B N Ⓝ B N N Ⓝ

then the circled N suffice to define $P(\lambda)$.
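
The selection rule can be written as a few lines of Python. The sketch assumes the
computation is recorded as a string of the letters B and N (a hypothetical encoding);
a nonbreakthrough is kept when a breakthrough follows it or when it is the last event
of the computation.

```python
def essential_nonbreakthroughs(events):
    """Return the positions of the nonbreakthroughs that suffice to define the curve."""
    keep = []
    for pos, event in enumerate(events):
        last = pos == len(events) - 1
        if event == "N" and (last or events[pos + 1] == "B"):
            keep.append(pos)
    return keep

positions = essential_nonbreakthroughs("BNBBNNBNNN")
```

Applied to the sequence of the example, the rule keeps the second, sixth, and tenth
events (positions 1, 5, 9 counting from zero). The starting duration from the Start
routine supplies the first point of the curve.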

At the conclusion of the computation, a chain of arcs corresponding to $k = 2$
has been located along which the equalities

(4.12)    $a(i, j) + \tau(i) - \tau(j) = 0$

hold. Summing (4.12) along this chain shows that $\lambda = \tau(n)$ is equal to the
$a$-length of the chain. Consequently the project cannot be completed in any shorter
time interval.

A consequence of the algorithm is that, given integral data in the problem,
all the numbers produced are integers. Hence, in particular, breakpoints of
$P(\lambda)$ are integers, and so are the corresponding optimal job times.

The method of this paper can also be used to compute project cost curves in
case the given job costs are piecewise linear and convex between crash and normal
completion times. This merely introduces more arcs into the network, in fact,
one more arc from $i$ to $j$ for each additional breakpoint of the function giving the
cost of job $(i, j)$.

INCREASING THE CAPACITY OF A NETWORK:
THE PARAMETRIC BUDGET PROBLEM*

D. R. FULKERSON 

The RAND Corporation 

The problem considered in this paper is that of allocating a budget of
resources among the links of a network for the purpose of increasing its flow
capacity relative to given sources and sinks.

On the assumption that the cost of increasing each link capacity is linear, a
labeling algorithm is described that permits rapid calculation of optimal
allocations for all budgets.

1. Introduction. Suppose that a fixed budget can be allocated among the 
links of a network for the purpose of increasing its flow capacity relative to a 
given source and sink. How should the money be spent in order to maximize the 
resulting network capacity? 

In this note we assume that the cost of increasing the capacity of a link is 
linear and homogeneous, which permits direct formulation of the problem
described above as a linear program, and then describe an algorithm that produces
solutions to the problem, not only for a fixed budget, but for all budgets, i.e., 
we solve the problem parametrically. The algorithm uses a variant of the labeling 
procedure previously developed to solve maximal network flow problems and 
minimal cost transportation problems [1-4]. 

It is interesting that, although the budget problem does not fall within the
class of transportation-type programming problems, it can still be solved by a
labeling procedure. Roughly speaking, the underlying reason for this is that, for
a given budget problem, one can find a pair of transportation-type linear
programs such that an optimal solution to the budget problem is given by a convex
combination of certain optimal solutions to the two auxiliary problems. Indeed,
our algorithm is designed to solve, efficiently, a sequence of such related
transportation-type problems, the sequence having the property that adjacent pairs
of solutions produced by the algorithm can be used to generate a solution of the
parametric budget problem.

Section 2, below, contains a formulation of the budget problem as a linear
program and a statement of the dual program. In Section 3 we set up the
sequence of associated programs and include some heuristic discussion. Section 4
provides a statement of the algorithm. A numerical example illustrating the
computation is given in Section 5. Section 6 concludes with proofs that the
algorithm produces solutions to the associated programs, and to the budget
problem.

2. The Budget Problem. We suppose given a network consisting of nodes
$P_0, P_1, \ldots, P_n$ and oriented links $P_iP_j$ leading from $P_i$ to $P_j$. Each link
$P_iP_j$ has associated with it two integers: $c_{ij}$, the existing flow capacity of the
link, assumed nonnegative, and $a_{ij}$, the cost per unit of additional capacity,
assumed positive. We take $P_0$ to be the source for flow, $P_n$ the sink.$^1$

Letting $x_{ij}$ denote the flow from $P_i$ to $P_j$ along $P_iP_j$, $y_{ij}$ the amount of
capacity added to $P_iP_j$, $b$ the total budget to be allocated for increased capacity,
and $v$ the net flow through the network from $P_0$ to $P_n$, the problem is to determine
nonnegative values of $x_{ij}$, $y_{ij}$, $v$ that

(1)   maximize $v$

subject to the constraints

      $\sum_j (x_{0j} - x_{j0}) - v = 0$

(2a)  $\sum_j (x_{ij} - x_{ji}) = 0 \qquad (i = 1, \ldots, n - 1)$

      $\sum_j (x_{nj} - x_{jn}) + v = 0$

(2b)  $x_{ij} - y_{ij} \le c_{ij}$

(2c)  $\sum_{i,j} a_{ij} y_{ij} = b.$


Here, of course, b is assumed nonnegative. 

Clearly this problem will not, in general, have integral solutions, because of
the presence of constraint (2c). Nonetheless, almost all of the computation can
be carried out in integers, as will be shown.
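
The constraints (2a)-(2c) can be checked mechanically for any candidate
solution. The following is a minimal sketch in Python, not part of the original
note: the function name, the dictionary layout keyed by link $(i, j)$, and the
three-node network in the usage lines are all hypothetical choices made for
illustration.

```python
def satisfies_budget_constraints(n, cap, cost, x, y, v, b, tol=1e-9):
    """Return True if (x, y, v) is feasible for constraints (2a)-(2c).

    Nodes are 0..n (0 = source, n = sink); cap, cost, x, y are dicts keyed
    by link (i, j).  This data layout is an illustrative assumption.
    """
    net_out = [0.0] * (n + 1)            # sum_j (x_ij - x_ji) for each node i
    for (i, j), flow in x.items():
        net_out[i] += flow
        net_out[j] -= flow
    conservation = (                     # the three groups of equations (2a)
        abs(net_out[0] - v) <= tol
        and abs(net_out[n] + v) <= tol
        and all(abs(net_out[i]) <= tol for i in range(1, n))
    )
    capacity = all(x[e] - y[e] <= cap[e] + tol for e in cap)        # (2b)
    budget = abs(sum(cost[e] * y[e] for e in cap) - b) <= tol       # (2c)
    nonneg = v >= -tol and all(x[e] >= -tol and y[e] >= -tol for e in cap)
    return conservation and capacity and budget and nonneg


# Hypothetical network: links 0->1, 1->2, 0->2 with sink n = 2.
cap = {(0, 1): 2, (1, 2): 1, (0, 2): 1}
cost = {(0, 1): 1, (1, 2): 2, (0, 2): 4}
x = {(0, 1): 1.5, (1, 2): 1.5, (0, 2): 1.0}
y = {(0, 1): 0.0, (1, 2): 0.5, (0, 2): 0.0}
print(satisfies_budget_constraints(2, cap, cost, x, y, v=2.5, b=1))
```

With a budget of 1 the candidate above spends everything on half a unit of
capacity for link $P_1P_2$, which illustrates why solutions need not be integral.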

For future reference, we note that if we assign constraints (2a) the
multipliers $\pi_i$ $(i = 0, \ldots, n)$, constraints (2b) the multipliers $\gamma_{ij}$, and constraint
(2c) the multiplier $\sigma$, one finds the dual of program (1) and (2) to be

(3)   minimize $\sum_{i,j} c_{ij} \gamma_{ij} + b\sigma$

subject to

(4a)  $-\pi_0 + \pi_n \ge 1$

(4b)  $\pi_i - \pi_j + \gamma_{ij} \ge 0$

(4c)  $a_{ij}\sigma - \gamma_{ij} \ge 0$

(4d)  $\gamma_{ij} \ge 0.$

If the nonnegative numbers $x_{ij}$, $v$ satisfy equations (2a), we shall call $x_{ij}$ a flow
(from $P_0$ to $P_n$) and $v$ the flow value.

3. The Related Problems. Consider the sequence of problems

(5)   maximize $tv - \sum_{i,j} a_{ij} y_{ij} \qquad (t = 1, 2, \ldots),$

each subject to constraints (2a) and (2b) in nonnegative variables.


1 We might equally well assume that there are several sources and sinks, provided we are
interested in flows from any source to any sink. However, this situation can always be
reduced to a single source and sink simply by joining all old sources to a new fictitious source
by links of large capacity, and similarly for the sinks.
Notice that for $t$ sufficiently large, e.g., if $t$ is greater than the cost of adding
a unit of capacity to each link of a chain from $P_0$ to $P_n$, the form (5) is
unbounded on the convex set defined by (2a) and (2b). Thus the sequence of
related problems we will need to consider is finite. We let $T$ denote the largest
value of $t$ for which the form (5) is bounded.

Now suppose $x_{ij}^t$, $y_{ij}^t$, $v^t$ solve the $t$-th one of these problems, and define

      $b^t = \sum_{i,j} a_{ij} y_{ij}^t \qquad (t = 1, \ldots, T).$

Then it is easy to see that $x_{ij}^t$, $y_{ij}^t$, $v^t$ solve the budget problem for $b = b^t$.
Moreover, the numbers $b^t$ will be monotone non-decreasing in $t$. It might
therefore seem plausible that if we are given $b$ such that $b^t < b < b^{t+1}$, then a
solution to such an intermediate budget problem could be generated by expressing
$b$ as a convex combination of $b^t$ and $b^{t+1}$, and taking the same convex
combination of the solutions $x_{ij}^t$, $y_{ij}^t$, $v^t$ and $x_{ij}^{t+1}$, $y_{ij}^{t+1}$, $v^{t+1}$. This turns out to be
almost right; that is, it is false that any two such solutions can be used in this
way to solve an intermediate budget problem, but it is true that there exist solutions
to the $t$-th and $(t + 1)$-th related problems that do generate solutions for all $b$
lying in the interval $(b^t, b^{t+1})$ associated with these particular solutions.

The algorithm of the next section will, in fact, be shown to produce integral
solutions $x_{ij}^t$, $y_{ij}^t$, $v^t$ $(t = 1, \ldots, T)$ and hence a set of integers
$0 = b^1 \le b^2 \le \cdots \le b^T$, such that

(a) if $b^t \le b \le b^{t+1}$, then a solution to the budget problem corresponding to
$b$ is given by a convex combination of $x_{ij}^t$, $y_{ij}^t$, $v^t$ and $x_{ij}^{t+1}$, $y_{ij}^{t+1}$, $v^{t+1}$;

(b) if $b > b^T$, a solution can be obtained from $x_{ij}^T$, $y_{ij}^T$, $v^T$.

Moreover, the computation for the related problem $t$ begins with the solution
previously generated for problem $t - 1$, and thus the entire set of “spanning”
solutions for the budget problems can be obtained efficiently.
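
Property (a) can be sketched in a few lines of Python. This is an illustration
only, not the author's procedure: the function name and the tuple layout are
hypothetical, and the sample values $b^3 = 1$, $v^3 = 4$, $b^4 = 7$, $v^4 = 6$ are
those quoted in the example of Section 5.

```python
from fractions import Fraction


def interpolate(b, lower, upper):
    """Flow value for a budget b between two adjacent spanning solutions.

    lower = (b_t, v_t) and upper = (b_{t+1}, v_{t+1}) with b_t < b_{t+1}.
    Returns (alpha, v), where b = alpha*b_t + (1 - alpha)*b_{t+1} and
    v = alpha*v_t + (1 - alpha)*v_{t+1}.
    """
    (b_lo, v_lo), (b_hi, v_hi) = lower, upper
    if b_lo == b_hi or not b_lo <= b <= b_hi:
        raise ValueError("need b_t <= b <= b_{t+1} with b_t < b_{t+1}")
    alpha = Fraction(b_hi - b) / (b_hi - b_lo)   # exact rational weight
    return alpha, alpha * v_lo + (1 - alpha) * v_hi


alpha, v = interpolate(4, (1, 4), (7, 6))
print(alpha, v)   # weight on the t-th solution and the interpolated flow value
```

Exact rational arithmetic is used because intermediate budgets generally give
fractional flows even though the spanning solutions themselves are integral.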

4. The Algorithm. Before stating the algorithm for solving the sequence of
related problems, we note that the dual of problem $t$ is to find numbers $\pi_i^t$, one
for each node $P_i$, and $\gamma_{ij}^t$, one for each arc $P_iP_j$, that

(6)   minimize $\sum_{i,j} c_{ij} \gamma_{ij}^t$

subject to the constraints

(7a)  $-\pi_0^t + \pi_n^t \ge t$

(7b)  $\pi_i^t - \pi_j^t + \gamma_{ij}^t \ge 0$

(7c)  $0 \le \gamma_{ij}^t \le a_{ij}.$

It follows that feasible solutions $x_{ij}^t$, $y_{ij}^t$, $v^t$ and $\pi_i^t$, $\gamma_{ij}^t$ to the primal and dual
problems, respectively, which satisfy the conditions

(8a)  $\pi_0^t = 0, \quad \pi_n^t = t$

(8b)  $\pi_i^t - \pi_j^t + \gamma_{ij}^t > 0 \Rightarrow x_{ij}^t = 0$

(8c)  $\gamma_{ij}^t > 0 \Rightarrow x_{ij}^t - y_{ij}^t = c_{ij}$

(8d)  $\gamma_{ij}^t < a_{ij} \Rightarrow y_{ij}^t = 0,$

are optimal solutions.

The dual variables $\gamma_{ij}^t$ and primal variables $y_{ij}^t$ need not be mentioned
explicitly in describing the computation. Instead, we shall deal only with node
numbers $\pi_i^t$ and flows $x_{ij}^t$, and will construct these to satisfy

(9a)  $\pi_0^t = 0, \quad \pi_n^t = t$

(9b)  $\pi_j^t - \pi_i^t \le a_{ij}$

(9c)  $\pi_j^t - \pi_i^t = 0 \Rightarrow x_{ij}^t \le c_{ij}$

(9d)  $\pi_j^t - \pi_i^t = a_{ij} \Rightarrow x_{ij}^t \ge c_{ij}$

(9e)  $\pi_j^t - \pi_i^t < 0 \Rightarrow x_{ij}^t = 0$

(9f)  $0 < \pi_j^t - \pi_i^t < a_{ij} \Rightarrow x_{ij}^t = c_{ij}.$

In addition, all variables will have integral values.

It is easy to check that if there are node numbers $\pi_i^t$ and a flow $x_{ij}^t$ such that
(9a)-(9f) hold, then by defining

(10)  $\gamma_{ij}^t = \max (0, \pi_j^t - \pi_i^t)$

(11)  $y_{ij}^t = \max (0, x_{ij}^t - c_{ij}),$

one has feasible solutions to both primal and dual problems that satisfy (8a)-
(8d), and hence are optimal.
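
Conditions (9a)-(9f) and definitions (10), (11) translate directly into a small
check. The sketch below is illustrative only: the function names, the dictionary
layout, and the three-node data are hypothetical, and the input `x` is assumed
already to be a flow in the sense of (2a). When the conditions hold, the primal
value $tv - \sum a_{ij} y_{ij}$ and the dual value $\sum c_{ij} \gamma_{ij}$ should coincide.

```python
def satisfies_conditions_9(n, cap, cost, pi, x, t):
    """Check (9a)-(9f) for node numbers pi[0..n] and link flows x."""
    if pi[0] != 0 or pi[n] != t:                 # (9a)
        return False
    for (i, j), c in cap.items():
        d, a, f = pi[j] - pi[i], cost[(i, j)], x[(i, j)]
        if d > a:                                # (9b)
            return False
        if d == 0 and f > c:                     # (9c)
            return False
        if d == a and f < c:                     # (9d)
            return False
        if d < 0 and f != 0:                     # (9e)
            return False
        if 0 < d < a and f != c:                 # (9f)
            return False
    return True


def primal_dual_values(cap, cost, pi, x, t):
    """Objective values of related problem t and of its dual (6)."""
    gamma = {(i, j): max(0, pi[j] - pi[i]) for (i, j) in cap}      # (10)
    y = {e: max(0, x[e] - cap[e]) for e in cap}                    # (11)
    v = sum(f for (i, j), f in x.items() if i == 0) \
        - sum(f for (i, j), f in x.items() if j == 0)
    primal = t * v - sum(cost[e] * y[e] for e in cap)
    dual = sum(cap[e] * gamma[e] for e in cap)
    return primal, dual


# Hypothetical data: links 0->1, 1->2, 0->2, sink n = 2, related problem t = 3.
cap = {(0, 1): 2, (1, 2): 1, (0, 2): 1}
cost = {(0, 1): 1, (1, 2): 2, (0, 2): 4}
pi, x = [0, 1, 3], {(0, 1): 2, (1, 2): 2, (0, 2): 1}
print(satisfies_conditions_9(2, cap, cost, pi, x, 3))
print(primal_dual_values(cap, cost, pi, x, 3))
```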

To start the computation, take $\pi_i^0 = 0$ and $x_{ij}^0 = 0$. These clearly satisfy
conditions (9) for $t = 0$. The computation now progresses by a sequence of
“labelings” (Step A below), each of which can terminate in one of three ways:
“finite breakthrough,” in which case the flow is changed (Step B),
“nonbreakthrough,” in which case the node integers are changed (Step C), or “infinite
breakthrough,” in which case the computation ends, and $T$ has been discovered.

The inputs for the $t$-th application of the routine composed of Steps A, B, C
are $\pi_i^{t-1}$, $x_{ij}^{t-1}$. The node numbers $\pi_i^{t-1}$ are used to divide the links $P_iP_j$ of the
network into three classes as follows. A link $P_iP_j$ is 0-admissible, $a$-admissible,
or inadmissible according as the value of $\pi_j^{t-1} - \pi_i^{t-1}$ is 0, $a_{ij}$, or neither of these.$^2$

Step A. (Labeling process).

(1) Assign $P_0$ the label $(P_{n+1}^+, \infty)$; consider $P_0$ as unscanned.

(2) Take any labeled, unscanned node $P_i$; suppose it is labeled $(P_k^{\pm}, \infty)$.
(Initially $P_0$ will be the only such.) To all nodes $P_j$ that are unlabeled and such
that $P_iP_j$ is $a$-admissible, assign the label $(P_i^+, \infty)$. Consider $P_i$ as scanned
and the newly labeled $P_j$, if any, as unscanned. Repeat until either the sink $P_n$
has been labeled (infinite breakthrough), or until no new labels are possible and
this is not the case. In the former case, terminate; in the latter case, proceed to
(3) below.

(3) (At this stage we have a labeled set of nodes including $P_0$ but not $P_n$,
and each has a label of the form $(P_k^+, \infty)$.) All nodes now revert to the
unscanned state, and the labeling process continues as follows. Take any labeled,
unscanned node $P_i$; suppose it is labeled $(P_k^{\pm}, h)$. (Initially we have only
labels of the form $(P_k^+, \infty)$.) To all nodes $P_j$ that are unlabeled, such that $P_iP_j$
is 0-admissible, and $x_{ij}^{t-1} < c_{ij}$, assign the label $(P_i^+, \min (h, c_{ij} - x_{ij}^{t-1}))$. To
all nodes $P_j$ that are now unlabeled, such that $P_jP_i$ is 0-admissible, and
$x_{ji}^{t-1} > 0$, assign the label $(P_i^-, \min (h, x_{ji}^{t-1}))$. Next, if $P_j$ is unlabeled and $P_iP_j$
is $a$-admissible, label $P_j$ with $(P_i^+, h)$. (Initially, when we are labeling from a node of
the starting set, this case cannot occur.) Finally, if $P_j$ is unlabeled, $P_jP_i$ is
$a$-admissible, and $x_{ji}^{t-1} > c_{ji}$, label $P_j$ with $(P_i^-, \min (h, x_{ji}^{t-1} - c_{ji}))$. Consider
$P_i$ as scanned and the newly labeled $P_j$, if any, as unscanned. Repeat until
either the sink $P_n$ has been labeled with, say, $(P_k^+, h)$,$^3$ or until no new labels
are possible and this is not the case. In the former case (finite breakthrough),
go to Step B. In the latter case (nonbreakthrough), go to Step C.

2 Thus initially all links are 0-admissible. Steps A, B, C, then reduce to the algorithm
of ref. [1] for constructing a flow of maximal value in a network with capacity limitations
on links.

Step B. (Flow change).

(Here the sink $P_n$ has been labeled with $(P_k^+, h)$.) Replace $x_{kn}^{t-1}$ by $x_{kn}^{t-1} + h$,
and go on to $P_k$ and its label. In general, if $P_k$ is labeled $(P_j^+, l)$, replace $x_{jk}^{t-1}$
by $x_{jk}^{t-1} + h$, and if labeled $(P_j^-, l)$, replace $x_{kj}^{t-1}$ by $x_{kj}^{t-1} - h$, in either case
turning attention then to $P_j$ and its label. Stop the flow change when $P_0$ has
been reached. Now discard the labels generated in (3) of Step A and repeat A3
with the new flow in place of $x_{ij}^{t-1}$.

Step C. (Node number change).

(The labeling process has resulted in nonbreakthrough.) Give the present
flow (which may or may not be $x_{ij}^{t-1}$) the name $x_{ij}^t$ and define node numbers
$\pi_i^t$ by

      $\pi_i^t = \pi_i^{t-1}$        if $P_i$ is labeled,
      $\pi_i^t = \pi_i^{t-1} + 1$    if $P_i$ is unlabeled.

The entire routine is then repeated using $\pi_i^t$ and $x_{ij}^t$ as inputs.
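
Steps A, B, C can be sketched compactly in Python. This is a hedged
illustration, not the author's program: the function name, the dictionary
representation of links, the scanning order, and the guard against networks with
no chain from source to sink are all assumptions of the sketch. Labels are stored
as `(predecessor, direction, h)`.

```python
import math


def parametric_budget(n, cap, cost):
    """Run Steps A, B, C on nodes 0..n (0 = source, n = sink).

    cap[(i, j)] and cost[(i, j)] are the integers c_ij >= 0 and a_ij > 0.
    Returns (spans, T) with spans[t - 1] = (b_t, v_t, flow_t).
    """
    links = list(cap)
    pi = [0] * (n + 1)                  # node numbers, initially all zero
    x = {e: 0 for e in links}           # flow, initially zero
    spans = []

    def kind(i, j):
        """'a' for a-admissible, '0' for 0-admissible, None otherwise."""
        d = pi[j] - pi[i]
        return "a" if d == cost[(i, j)] else "0" if d == 0 else None

    def scan(label, only_a):
        """Extend the label set; only_a=True restricts to Steps A1-A2."""
        todo = list(label)
        while todo and n not in label:
            i = todo.pop()
            h = label[i][2]
            for (u, w) in links:
                k, f, c = kind(u, w), x[(u, w)], cap[(u, w)]
                if u == i and w not in label:          # link used forward
                    if k == "a":
                        label[w] = (i, +1, h)
                    elif k == "0" and not only_a and f < c:
                        label[w] = (i, +1, min(h, c - f))
                    else:
                        continue
                    todo.append(w)
                elif w == i and u not in label and not only_a:  # reverse use
                    if k == "0" and f > 0:
                        label[u] = (i, -1, min(h, f))
                    elif k == "a" and f > c:
                        label[u] = (i, -1, min(h, f - c))
                    else:
                        continue
                    todo.append(u)
        return label

    while True:
        start = scan({0: (None, +1, math.inf)}, only_a=True)   # A1-A2
        if n in start:                         # infinite breakthrough
            return spans, pi[n]
        while True:
            label = scan(dict(start), only_a=False)            # A3
            if n not in label:                 # nonbreakthrough
                break
            h, k = label[n][2], n              # Step B: trace labels back
            while k != 0:
                j, sign, _ = label[k]
                if sign > 0:
                    x[(j, k)] += h
                else:
                    x[(k, j)] -= h
                k = j
        pi = [p if i in label else p + 1 for i, p in enumerate(pi)]  # Step C
        v = sum(f if u == 0 else -f for (u, w), f in x.items() if 0 in (u, w))
        b = sum(cost[e] * max(0, x[e] - cap[e]) for e in links)
        spans.append((b, v, dict(x)))
        if pi[n] > sum(cost.values()):         # guard added in this sketch
            raise ValueError("no chain from source to sink")


# Hypothetical network: links 0->1, 1->2, 0->2 with sink n = 2.
cap = {(0, 1): 2, (1, 2): 1, (0, 2): 1}
cost = {(0, 1): 1, (1, 2): 2, (0, 2): 4}
spans, T = parametric_budget(2, cap, cost)
print(T, [(b, v) for b, v, _ in spans])
```

On this small network the cheapest chain for added capacity is $P_0P_1$, $P_1P_2$, of
total unit cost 3, which is the value of $T$ the sketch reports.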

In the concluding section we shall sketch proofs that the flows $x_{ij}^t$ generated
in the computation have the properties discussed in Section 3, but perhaps some
preliminary explanatory comments are in order.

The labeling process A1-A2 is a search for a chain from $P_0$ to $P_n$ of
$a$-admissible links. If none such exists, we proceed to enlarge the search (A3) in an
attempt to find a path from $P_0$ to $P_n$ of admissible links (where the word “path,” as
opposed to “chain,” means that a link may be traversed opposite its orientation
in going from $P_0$ to $P_n$) having the property that the (integral) flow change $h$
made along the path (Step B) is positive and yields a flow again satisfying (9c)
and (9d). Inadmissible links correspond to (9e) and (9f), and in these we keep
the flow fixed, so that these conditions are also maintained. Thus, if we enter the
routine with node numbers $\pi_i^{t-1}$ and a flow $x_{ij}^{t-1}$ satisfying (9c)-(9f), the same
node numbers $\pi_i^{t-1}$ and the output flow $x_{ij}^t$ still satisfy (9c)-(9f), and
consequently the output flow will again be a solution to related problem $t - 1$. In
addition, it is a solution to related problem $t$ (as can be shown using the
transformation of node numbers given in Step C), and hence we can repeat the
process. It is this fact, that $x_{ij}^t$ solves both problems $t - 1$ and $t$, which enables
one to prove that the sequence of flows $x_{ij}^1, \ldots, x_{ij}^T$ produced by the algorithm
are spanning solutions for the budget problem.

3 The sink $P_n$ will never receive a label of the form $(P_k^-, h)$, since every flow generated
by the algorithm will have $x_{nj} = 0$. Similarly each flow will have $x_{j0} = 0$, so that any
node $P_j$ labeled from $P_0$ will have a label of the form $(P_0^+, h)$.

5. An Example. Let the network be that of Fig. 1, the capacity $c_{ij}$ of link
$P_iP_j$ being the number in the upper left of the box, and the cost $a_{ij}$ of adding
one unit of capacity being the number in the upper right.$^4$ Assume that we have
the node numbers $\pi_i^3$ shown in the figure, and the flow $x_{ij}^3$ indicated by the
numbers in the lower left of the boxes, and wish to compute $x_{ij}^4$ and $\pi_i^4$. Using the
numbers $\pi_i^3$, we divide the links into the three classes: 0-admissible (indicated in
the figure by a zero in the lower right of the box), $a$-admissible (indicated by an
$a$ in the lower right of the box), and inadmissible (indicated by no entry in the
lower right of the box).

[Figs. 1-3]

4 Links not shown in Fig. 1 are assumed to have zero capacity and large cost for
additional capacity.

The labeling process A1-A2 yields the labels $(P_4^+, \infty)$ on $P_0$ and $(P_0^+, \infty)$
on $P_2$, and we go on to A3. Scanning $P_0$ gives no more labels, but from $P_2$ we
label $P_1$ with $(P_2^+, \min (\infty, 1))$, and this completes the scanning of $P_2$.
(Notice that $P_1$ could also have been labeled with $(P_2^-, \min (\infty, 1))$, since the
order in which the labeling rules of A3 are applied is immaterial.) Finally, from
$P_1$ we reach $P_3$ with the label $(P_1^+, 1)$, and have thus located a chain,
found by tracing the labels backward from $P_3$, along which we can increase the
flow by an additional unit.

After changing the flow, discarding the old labels, and relabeling, we obtain
the labels shown in Fig. 2. Again we have a finite breakthrough, and therefore
change the flow along the path indicated by the labels: add 1 to $x_{13}$, subtract 1
from $x_{12}$, and add 1 to $x_{02}$. We then relabel, obtaining the labels shown in Fig.
3. This time we have a nonbreakthrough, and thus go to Step C, the
node-number change. The flow shown in Fig. 3 is therefore $x_{ij}^4$, and the new node
numbers $\pi_i^4$ are given by adding 1 to the numbers on unlabeled nodes $P_i$ and
leaving the others unchanged: $\pi_0^4 = 0$, $\pi_1^4 = 3$, $\pi_2^4 = 2$, $\pi_3^4 = 4$.

Observe that

      $\sum_{i,j} a_{ij} y_{ij}^4 = 2y_{02}^4 + 1y_{13}^4 = 7,$

and thus if we are given a budget $b = 7$, we should boost the capacity of $P_0P_2$
by 2 units, that of $P_1P_3$ by 3 units, thereby achieving a total flow of 6 units
from $P_0$ to $P_3$. On the other hand, we see from Fig. 1 that

      $\sum_{i,j} a_{ij} y_{ij}^3 = 1y_{13}^3 = 1,$

so that with $b = 1$, the capacity of $P_1P_3$ should be increased by 1 unit, permitting
a total flow of 4 units through the network. Notice also that

      $3v^3 - \sum_{i,j} a_{ij} y_{ij}^3 = 11 = 3v^4 - \sum_{i,j} a_{ij} y_{ij}^4,$

and hence $x_{ij}^4$ solves related problem 3 provided $x_{ij}^3$ does.
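
As a worked instance of how these two solutions span an intermediate budget,
take $b = 4$ (a value chosen here for illustration; it does not appear in the
original example). Assuming the convex-combination rule of Section 3 applies to
this pair, and using only the figures quoted above, one obtains:

```latex
\begin{aligned}
b &= 4 = \tfrac12 b^3 + \tfrac12 b^4 = \tfrac12(1) + \tfrac12(7),
   \qquad \alpha = \tfrac12, \\
y_{02} &= \tfrac12(0) + \tfrac12(2) = 1, \qquad
y_{13} = \tfrac12(1) + \tfrac12(3) = 2, \\
\sum_{i,j} a_{ij} y_{ij} &= 2(1) + 1(2) = 4 = b, \qquad
v = \tfrac12(4) + \tfrac12(6) = 5 .
\end{aligned}
```

The increase from $v^3 = 4$ is $(b - b^3)/3 = 1$, in agreement with the slope $1/t$
for $t = 3$.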

6. Theorems and Proofs. It is not difficult to see that if we enter Step A
with a flow $x_{ij}$ and obtain new numbers $x'_{ij}$ via Step B, then $x'_{ij}$ is a flow also,
since it is obtained from $x_{ij}$ by adding a positive amount $h$ to the flow in links
of a path from $P_0$ to $P_n$ that are traversed with their orientation (in going from
$P_0$ to $P_n$), and subtracting $h$ from the flows in links traversed against their
orientation. Moreover, $h$ is no greater than the minimum of the link flows in
the reverse oriented links of the path, so that nonnegativity is maintained.

The routine composed of Steps A, B, C terminates. For if A1 and A2 do not
locate a chain of $a$-admissible links from $P_0$ to $P_n$, let $L$ be the set of indices of
nodes that are labeled in A1 and A2. Thus $0 \in L$, $n \notin L$. Now any flow $x_{ij}$
produced via A3 and B satisfies $x_{ij} \le c_{ij}$ for all links $P_iP_j$ that are not $a$-admissible.
Hence, summing equations (2a) over $i \in L$ yields

      $v = \sum_{i \in L,\ j \notin L} (x_{ij} - x_{ji}) \le \sum_{i \in L,\ j \notin L} x_{ij}$

and thus, since links $P_iP_j$ for $i \in L$, $j \notin L$ are not $a$-admissible, we have

      $v \le \sum_{i \in L,\ j \notin L} c_{ij}.$

Consequently, since $v$ increases by $h \ge 1$ with each occurrence of a flow change,
there can be only finitely many of these.

Thus, starting with the flow $x_{ij}^0 = 0$, the algorithm successively produces
flows $x_{ij}^t$ for $t > 0$.

Theorem 1. The flows $x_{ij}^t$ produced by the algorithm and the corresponding
$y_{ij}^t = \max (0, x_{ij}^t - c_{ij})$, $v^t = \sum_j (x_{0j}^t - x_{j0}^t)$, maximize the form
$tv - \sum_{i,j} a_{ij} y_{ij}$ subject to constraints (2a), (2b) in nonnegative variables,
i.e. $x_{ij}^t$, $y_{ij}^t$, and $v^t$ solve related problem $t$.

It suffices to show that $\pi_i^t$, $x_{ij}^t$ satisfy (9a)-(9f).

Since it is clear that

      $\pi_i^0 = 0, \quad x_{ij}^0 = 0, \quad y_{ij}^0 = 0, \quad v^0 = 0$

satisfy (9a)-(9f) with $t = 0$, we may proceed by induction on $t$.

Property (9a) is clear from the induction assumption $\pi_0^{t-1} = 0$, $\pi_n^{t-1} = t - 1$,
the node number change of Step C, and the fact that $P_0$ is labeled and $P_n$
unlabeled in case of nonbreakthrough.

Consider (9b). Since $\pi_j^{t-1} - \pi_i^{t-1} \le a_{ij}$, then $\pi_j^t - \pi_i^t$ could exceed $a_{ij}$ only
if $\pi_j^{t-1} - \pi_i^{t-1} = a_{ij}$ and $\pi_j^t = \pi_j^{t-1} + 1$, $\pi_i^t = \pi_i^{t-1}$. But then $P_iP_j$ is
$a$-admissible, $P_i$ is labeled and $P_j$ unlabeled at the conclusion of labeling, a contradiction.

For (9c), suppose $\pi_j^t - \pi_i^t = 0$, and consider cases. If $\pi_j^{t-1} - \pi_i^{t-1} < 0$, so
that $x_{ij}^{t-1} = 0$, then, since $P_iP_j$ is inadmissible, we also have $x_{ij}^t = x_{ij}^{t-1} = 0 \le c_{ij}$.
If $\pi_j^{t-1} - \pi_i^{t-1} = 0$, so that $x_{ij}^{t-1} \le c_{ij}$, again we have $x_{ij}^t \le c_{ij}$, since $x_{ij}^{t-1}$ can
be increased by at most $c_{ij} - x_{ij}^{t-1}$ in a sequence of flow changes. If

      $0 < \pi_j^{t-1} - \pi_i^{t-1} < a_{ij},$

then $P_iP_j$ is inadmissible and consequently $x_{ij}^t = x_{ij}^{t-1} = c_{ij}$. Finally, if

      $\pi_j^{t-1} - \pi_i^{t-1} = a_{ij},$

then

      $\pi_i^t = \pi_i^{t-1} + 1, \quad \pi_j^t = \pi_j^{t-1},$

and hence $P_i$ is unlabeled, $P_j$ labeled at the conclusion of labeling. But if

      $x_{ij}^t > c_{ij},$

this is a contradiction, since $P_iP_j$ is $a$-admissible. Hence $x_{ij}^t \le c_{ij}$. This completes
the proof of (9c).

Proofs of the remaining properties can be given along similar lines, and so we
omit them.

Corollary. The flow $x_{ij}^t$ and its corresponding $y_{ij}^t$, $v^t$ solve related problem $t - 1$.
This follows from the fact that $x_{ij}^{t-1}$, $y_{ij}^{t-1}$, $v^{t-1}$ solve related problem $t - 1$ and
the remarks at the end of Sec. 4.
Suppose that the algorithm terminates after the $T$-th application of the routine
composed of steps A, B, C, i.e., we enter step A with $\pi_i^T$, $x_{ij}^T$ and infinite
breakthrough occurs. Thus a chain of $a$-admissible links from $P_0$ to $P_n$, say

(12)  $P_{i_0}P_{i_1},\ P_{i_1}P_{i_2},\ \ldots,\ P_{i_{k-1}}P_{i_k} \qquad (i_0 = 0,\ i_k = n),$

has been located, and hence from (9a) and the definition of $a$-admissibility, it
follows that

(13)  $T = \pi_n^T - \pi_0^T = \sum_{l=0}^{k-1} (\pi_{i_{l+1}}^T - \pi_{i_l}^T) = \sum_{l=0}^{k-1} a_{i_l i_{l+1}}.$

Consequently this chain, of “$a$-length” $T$, has minimal $a$-length over all chains
from $P_0$ to $P_n$, since if $T$ were greater than the $a$-length of some chain, the form
$Tv - \sum_{i,j} a_{ij} y_{ij}$ would obviously be unbounded, contradicting the maximality
of $Tv^T - \sum_{i,j} a_{ij} y_{ij}^T$.

Let $b^t = \sum_{i,j} a_{ij} y_{ij}^t$ $(t = 1, \ldots, T)$ be the successive values of $\sum_{i,j} a_{ij} y_{ij}$
produced by the algorithm. Then

      $0 = b^1 \le b^2 \le \cdots \le b^T.$

For on the first application of the algorithm, all links are 0-admissible, hence
$x_{ij}^1 \le c_{ij}$, or $b^1 = 0$. To establish the monotoneity, assume that $b^t < b^{t-1}$. Since
$y_{ij}^{t-1}$, $v^{t-1}$ and $y_{ij}^t$, $v^t$ are respectively maximal in problems $t - 1$ and $t$, we have

      $(t - 1)v^{t-1} - b^{t-1} \ge (t - 1)v^t - b^t$

      $tv^t - b^t \ge tv^{t-1} - b^{t-1},$

whence adding gives

      $v^{t-1} \le v^t,$

an inequality that is also clear directly from the algorithm. Thus, if $b^t < b^{t-1}$,
we get

      $(t - 1)v^t - b^t > (t - 1)v^{t-1} - b^{t-1},$

a contradiction.

Theorem 2. Let $b = \alpha b^t + (1 - \alpha)b^{t+1}$, $0 \le \alpha \le 1$. Then

      $x_{ij} = \alpha x_{ij}^t + (1 - \alpha)x_{ij}^{t+1}$
      $y_{ij} = \alpha y_{ij}^t + (1 - \alpha)y_{ij}^{t+1}$
      $v = \alpha v^t + (1 - \alpha)v^{t+1}$

solve (1) and (2). If, on the other hand, we have $b > b^T$, then the flow $x'_{ij}$ and its
corresponding $y'_{ij}$, $v'$ obtained from $x_{ij}^T$, $y_{ij}^T$, $v^T$ by adding $(1/T)(b - b^T)$ units
of flow along the $a$-admissible chain (12), solve (1) and (2).

While Theorem 2 can be proved directly, we choose to give a proof using the
dual problem (3) and (4) in order to point out how to obtain solutions to the
dual of the budget problem from the node numbers generated in the algorithm.

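Inasmuch as $\pi_i^t$ and the associated $\gamma_{ij}^t$ given by (10) satisfy the constraints
(7), it follows that

(14)  $\pi_i = \dfrac{\pi_i^t}{t}, \qquad \gamma_{ij} = \dfrac{\gamma_{ij}^t}{t}, \qquad \sigma = \dfrac{1}{t}$

satisfy the constraints (4). Moreover, we have

      $\sum_{i,j} c_{ij} \gamma_{ij}^t = tv^t - b^t,$

since $\pi_i^t$, $\gamma_{ij}^t$ are optimal for (6) and (7). Thus

      $\sum_{i,j} c_{ij} \gamma_{ij} + b\sigma = \dfrac{1}{t} \Big( \sum_{i,j} c_{ij} \gamma_{ij}^t + b \Big)$

      $\qquad = \dfrac{1}{t} (tv^t - b^t + b)$

      $\qquad = v^t + \dfrac{1}{t} (b - b^t).$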


Now since $x_{ij}^{t+1}$, $y_{ij}^{t+1}$, $v^{t+1}$ and $x_{ij}^t$, $y_{ij}^t$, $v^t$ both solve problem $t$, we have

      $tv^{t+1} - b^{t+1} = tv^t - b^t.$

Thus if $b^{t+1} = b^t = b$, then $v^{t+1} = v^t = v$, and hence $\sum_{i,j} c_{ij} \gamma_{ij} + b\sigma = v$. If,
on the other hand, $b^t < b^{t+1}$, we have

      $\dfrac{1}{t} = \dfrac{v^{t+1} - v^t}{b^{t+1} - b^t},$

so that

      $v = v^t + (1 - \alpha)(v^{t+1} - v^t)$

      $\quad = v^t + \dfrac{b - b^t}{b^{t+1} - b^t} (v^{t+1} - v^t)$

      $\quad = v^t + \dfrac{1}{t} (b - b^t).$

Thus in either case, we see that

(15)  $\sum_{i,j} c_{ij} \gamma_{ij} + b\sigma = v.$

Hence, since $x_{ij}$, $y_{ij}$, $v$ satisfy (2), and $\pi_i$, $\gamma_{ij}$, $\sigma$ satisfy (4), it follows from
(15) that they constitute optimal dual solutions.

Suppose, finally, that $b > b^T$. It follows from (9d) and the existence of the
$a$-admissible chain (12) that

      $\sum_{i,j} a_{ij} y'_{ij} = b^T + \dfrac{1}{T} (b - b^T) \sum_{l=0}^{k-1} a_{i_l i_{l+1}},$

and hence from (13) we have

      $\sum_{i,j} a_{ij} y'_{ij} = b.$

Thus $x'_{ij}$, $y'_{ij}$, $v'$ satisfy (2). Defining

(16)  $\pi'_i = \dfrac{\pi_i^T}{T}, \qquad \gamma'_{ij} = \dfrac{\gamma_{ij}^T}{T}, \qquad \sigma' = \dfrac{1}{T}$

again gives a pair of optimal dual solutions to the budget problem.
ON THE STATUS OF MULTISTAGE LINEAR
PROGRAMMING PROBLEMS*†

GEORGE B. DANTZIG
The RAND Corporation, Santa Monica, California

PART I—SPECIAL CASES 
Introduction 

Typical of the multistage problems are those encountered in dynamic problems.
If the time span is divided into periods, the initial inventory provides the input
for activities that occur in the first period or first stage. The output from the
first stage provides the inventory input for activities in the second period or
stage, etc. [9]. The matrix of coefficients in the linear programming model then
takes on the special form (1).

(1)   [Diagram: staircase arrangement of the nonzero blocks of the coefficient matrix.]

To illustrate, consider the well-known warehousing problem [2, 5, 6, 17] of how
much to purchase and sell of a commodity each month, given a fixed warehouse
capacity, storage costs and expected prices from month to month; let $\alpha$ = the
fixed warehouse capacity, $s_0$ = the initial stock in warehouse.

Consider a seasonal product to be bought and sold for each of $i = 1, 2, \ldots, n$
periods. For the $i$th period

$c_i$ = cost per unit,
$p_i$ = selling price,
$w_i$ = warehouse cost per unit,
$x_i$ = amount sold,
$y_i$ = amount purchased,
$t_i$ = amount in stock after sale of old stock,
$s_i$ = amount in stock after purchase of new stock,
$u_i$ = unused warehouse capacity.

* This paper was presented before the 1957 meeting in Stockholm of the International
Statistical Institute and published in their I.S.I. Bulletin Vol. 36, Part 3.
† Received December 1958.
All of these quantities will be assumed to be nonnegative. In this problem we can
distinguish two types of stages. If the activities that take place within a period
are considered as forming a stage, then within a period there are two substages:
the selling stage which takes place before the purchase stage. The relations within
and between periods are as follows:



      $\ldots$
      $t_3 + y_3 - s_3 = 0$
      $\ldots$

It will be noted that each variable appears in either one equation or it appears
in two equations with opposite signs. This, however, is the condition that a
linear programming problem be a Hitchcock-Koopmans transportation
(distribution) problem [7, 21, 23]. With the conditions $s_i \le \alpha$ the problem clearly belongs
to the class of so-called “capacitated” transportation problems and suggests
that the problem can be viewed as a “network flow” problem with capacity
restraints on arcs of the network [11, 16, 18, 20, 29, 30].

The network has an exceptionally simple form because of the stagewise
character of the problem. In (4) each node $i$ in the network corresponds to the
$i$th equation; each arc joining two nodes corresponds to a “shipment” from $i$
to $j$; i.e., a variable that has ($-1$) in equation $i$ and ($+1$) in equation $j$; an arc
with one node corresponds to an exogenous shipment to or from $i$; i.e., a variable
with a single non-zero coefficient $+1$ or $-1$, respectively. The equations state
that the sum of flows into or out of a node must balance to zero.

(4)   [Network diagram; legible arc labels: $y_1$, $y_2$, $y_3$.]


Techniques have been worked out for solving network flow problems rapidly by
hand even when a great number of periods is involved. Moreover, it is just as
computationally tractable if upper bound restraints on the amount that can be
purchased or sold in any one period are imposed on the variables; or more generally
if there are incrementally increasing costs per unit purchased or decreasing prices
per unit sold [8, 10, 11].

Dynamic Leontief Models with Substitution

The warehouse problem can be reduced to another important class of problems
by the following steps. We first substitute $s_i = t_i + y_i$ in the remaining equations
(and the objective form), yielding

(5)   $\ldots$
      $-p_1x_1 + w_1t_1 + \bar{c}_1y_1 - p_2x_2 + w_2t_2 + \bar{c}_2y_2 - p_3x_3 + w_3t_3 + \bar{c}_3y_3 - p_4x_4 + w_4t_4 = z$ (min)

where $\bar{c}_i = c_i + w_i$.

An optimal solution to (5) will be the same if (5) is augmented by the equations
$s_i = t_i + y_i$, for $s_i \ge 0$ if $t_i \ge 0$ and $y_i \ge 0$. System (13) which is formed from (5)
by linear combinations shown in the right margin is clearly equivalent to it.

      $\ldots$                                                        Price    Operation
      $-p_1x_1 + w_1t_1 + \bar{c}_1y_1 - p_2x_2 + w_2t_2 + \bar{c}_2y_2 - p_3x_3 + w_3t_3 + \bar{c}_3y_3 - p_4x_4 + w_4t_4 = z$


In this form the system displays some remarkable properties. In the first 
place, it is still in transportation format, i.e., each column has one +1 and one 
-1 or a single +1 entry. The corresponding network takes the simple but quite 
different form from (4), and suggests that it may be worth while to make a study 
of equivalent networks obtained by manipulations of the equations 



Of greater importance, however, is that each column in this system has one
positive coefficient and the right-hand side is nonnegative. Hence this is a
Leontief-type Model with Substitute Activities [12]. Accordingly, the Samuelson
Substitution Theorem [12, 24] may be applied. It states that there exists an
optimal solution in which either $x_1$ and/or $t_1$ appears as a basic variable, similarly
either $u_1$ and/or $y_1$, $x_2$ and/or $t_2$, $\ldots$, $x_4$ and/or $t_4$. This fact is evident if the
right-hand side is perturbed to be positive. Since the number of basic
variables equals the number of equations, only one of each pair can occur. This
shows that in an optimal solution, at the selling stage, either the entire inventory
will be sold ($t_i = 0$) or there will be no sales ($x_i = 0$); at the buying stage, either
the inventory will be at capacity ($u_i = 0$) or there will be no purchases ($y_i = 0$).

The most remarkable property about such models concerns the fact that the
optimal choice of basic activities depends only on the objective form and not upon
the right-hand side. This is always true under mild conditions (e.g., there exist
feasible solutions to the system for at least one positive right-hand side and $z$
has a finite lower bound). Suppose in general we have a “block triangular”
linear programming problem [11] of the form
      $A_{11}X_1 = b^{(1)}$

      $A_{21}X_1 + A_{22}X_2 = b^{(2)}$

(8)   $A_{31}X_1 + A_{32}X_2 + A_{33}X_3 = b^{(3)}$

      $A_{41}X_1 + A_{42}X_2 + A_{43}X_3 + A_{44}X_4 = b^{(4)}$

      $C^{(1)}X_1 + C^{(2)}X_2 + C^{(3)}X_3 + C^{(4)}X_4 = z$

where $A_{ij}$ are matrices, $X_1$ is a vector of activity levels for the first stage, $X_2$ a
vector of activity levels for the second stage, etc., $C^{(i)}$ is a row of costs, and
$b^{(i)}$ a column of constraints. We assume further that (8) is also a Leontief
Substitution Model so that each column of coefficients has one and only one positive
coefficient and this occurs in the diagonal matrix $A_{ii}$. Also we assume all
components of $b^{(i)}$ nonnegative.

Let $B$ be any starting feasible basis. For example, in the warehouse model, the
submatrix of coefficients resulting from arbitrarily selecting one of each of the
paired variables can be used as a starting basis. Let $B$ be partitioned, see (9),

(9)   $B = \begin{bmatrix} B_{11} & & & \\ B_{21} & B_{22} & & \\ B_{31} & B_{32} & B_{33} & \\ B_{41} & B_{42} & B_{43} & B_{44} \end{bmatrix}$

to correspond to (8) and let $\gamma^{(i)}$ be the coefficients of $C^{(i)}$ corresponding to the
columns in the basis. It is easy to see that the diagonal arrays $B_{ii}$ must be square
Leontief matrices, each possessing an inverse. In solving a general model of this
type by the revised simplex method, it is only necessary to maintain the inverse
of these diagonal submatrices rather than the inverse for all of $B$ from iteration
to iteration.

Optimization in this model consists of solving a number of smaller Leontief-type
models, one for each stage. What is determined is the optimum choice of activities
for the last stage (but not their activity levels). This is followed by the optimum
choice of activities for the next to last stage, etc., until an optimum choice is
known for the first stage. Once all the columns in the basis are known, the
activity levels for the first period can be computed, then for the second period,
etc. To illustrate the procedure just described in a little greater detail, the first
step is to compute the prices associated with equations for the last period by

(10)  $\pi^{(4)} = \gamma^{(4)} B_{44}^{-1}.$

If all components of the vector

(11)  $C^{(4)} - \pi^{(4)} A_{44}$

are nonnegative, the activities in the basis associated with the last period are
the optimal choice. If not, the column corresponding to the most negative
component is next substituted for the column in the basis with a positive coefficient
in the same row (the “substitute” activity) to form a new basis. It should be
noted that the activity levels are not computed. The process is repeated until
an optimum selection of columns for the basis for the last period has been
determined. Once obtained, it is never changed.

The process is now repeated for the third period by using the final $\pi^{(4)}$ pricing
vector to replace $C^{(3)}$ by

(12)  $\bar{C}^{(3)} = C^{(3)} - \pi^{(4)} A_{43}.$

Then prices are next computed by

(13)  $\pi^{(3)} = \bar{\gamma}^{(3)} B_{33}^{-1}$

where $\bar{\gamma}^{(3)}$ are the components of $\bar{C}^{(3)}$ corresponding to columns in the basis.
The choice of third stage columns is optimal if all components of the vector

(14)  $\bar{C}^{(3)} - \pi^{(3)} A_{33}$

are nonnegative, etc. It is seen that solution of such a dynamic system reduces to a
sequence of single period problems.

For our four-stage warehouse example this is particularly simple. Let $\pi_t$ denote
the price associated with the $t$th selling stage and $\bar{\pi}_t$ the price associated with
the $t$th purchasing phase. Then from (10)-(14), the optimal program is found
by following seven easy steps:

(15)
        Computation                                              Decision

    1.  $\pi_4 = \min(-p_4,\ w_4)$                               Sell if first term^1 is minimum
    2.  $\bar{\pi}_3 = \min(0,\ c_3 + \pi_4)$                    Buy if second term is minimum
    3.  $\pi_3 = \min(-p_3 + \bar{\pi}_3,\ w_3 + \pi_4)$         Sell if first term is minimum
    4.  $\bar{\pi}_2 = \min(\bar{\pi}_3,\ c_2 + \pi_3)$          Buy if second term is minimum
    5.  $\pi_2 = \min(-p_2 + \bar{\pi}_2,\ w_2 + \pi_3)$         Sell if first term is minimum
    6.  $\bar{\pi}_1 = \min(\bar{\pi}_2,\ c_1 + \pi_2)$          Buy if second term is minimum
    7.  $\pi_1 = \min(-p_1 + \bar{\pi}_1,\ w_1 + \pi_2)$         Sell if first term is minimum

The Generalized Warehouse Model: Charnes and Cooper [6] have shown a
similar result for the case of several commodities sharing the same warehouse.
This is interesting, for as a rule the theory of multicommodity transportation-
type problems lacks the elegance of the single commodity case. We illustrate
this result for three periods and two commodities; the procedure is general.
If the variables and constants associated with the second commodity are denoted
by primes, then, analogous to (4), we have the network two-commodity flow

1 If $-p_4 = \min(-p_4, w_4)$, then sell in fourth selling stage; if $w_4 = \min(-p_4, w_4)$,

[Figure: two-commodity network flow, with arcs x_1, y_1, x_2, y_2, x_3, y_3, x_4 for the
first commodity and x'_1, y'_1, x'_2, y'_2, x'_3, y'_3, x'_4 for the second]


where the heavy dot • represents the warehouse capacity equation in various 
time periods. The formal equations are given in (17). This system is equivalent 
to (18) where the steps required to derive the equations from (17) are shown on 
the right. It is clear that (18) satisfies the conditions for both a transportation 
problem and also a Dynamic Leontief System with Substitution. 

From the latter it follows that at each selling stage either the entire product is 
sold or none, at each buying stage at most only one product is purchased or none. 
It may be solved iteratively backward from the last period for determination
of the optimal choice of activities independent of the right-hand side. The
decision rules analogous to the one-commodity case (15) are just as easy to set
down.


Solving Dynamic Problems from Steady State Problems 

Perhaps one of the most exciting ideas to date is a possibility that the solution
to a dynamic problem might be obtained as a by-product of an iterative
procedure for solving a steady state problem.

Ford and Fulkerson in [20] first used the Primal Dual Algorithm to find the 
maximal steady state flow in a network with fixed capacity on arcs. Later they 
tried out their methods on a dynamic network problem where the objective 
was to maximize the total flow in T time periods. For this model, in addition to 
the fixed capacity, there was a time to traverse each arc. They discovered that 
by slightly altering their algorithm for the solving of a steady state problem the 
successive cycles were producing optimal solutions to first a T = 1, then a 
T = 2, then a T = 3, etc., dynamic network flow problem. 

The interesting open question is whether this idea can be generalized. 

The Functional Equation Approach 

This approach has been developed with special reference to multistage
processes [3]. At the beginning of each stage there is a status vector which typically
represents the inventories available to perform activities within the stage and
subsequent stages. The structure of the model is such that the status vector
represents the only connection between these activities and those that occur prior
to the stage. The activities within a stage transform the status vector into a new
status vector for initiation of the next stage. In our warehouse example the status 
vector has only one component , either s t at the beginning of a selling stage or U 
at the beginning of a buying stage. 

The following is an elaboration of a proposal by R. Bellman [4]. Let us suppose, 
for a general multistage linear programming model (1), that there is only one 
variable Si shared in common between stages which we will call the status. 



In order to solve the problem we begin by finding the optimal program for the
last stage, see (20), where $s_3$ is treated as a parameter. This is done by solving the
linear programming problem (20) for a particular value of $s_3$ (say, $s_3 = 0$).

(20)    [Diagram: last-stage problem with the status $s_3$ as a parameter]

If the simplex method is used for the minimization, the solution to the dual is
obtained as well. This permits use of a technique known as parametric linear
programming, a variant of the dual simplex algorithm, by means of which $s_3$ can
be varied over any specified range of values, yielding the contribution $z_3$ of the
activities in the fourth stage to the objective form as a function of $s_3$. Now
$z_3 = z_3(s_3)$ is a broken line convex function of $s_3$ and only the values at the
breakpoints are recorded; those between are available by linear interpolation; see (21).



(21)    [Graph: the broken line convex function $z_3(s_3)$, with breakpoints marked by dotted lines]

[Diagram: stage III problem with statuses $s_2$ and $s_3$; objective $z_2(s_2)$ to be minimized]

This is not a standard linear programming problem because we have a broken
line function $z_3(s_3)$ instead of the usual linear function of $s_3$ in the objective
form. Since $z_3(s_3)$ is convex, it is possible [8, 10, 11] to substitute for $s_3$:

(23)    $s_3 = \lambda_1 + \lambda_2 + \cdots + \lambda_k, \quad (0 \le \lambda_i \le h_i),$

        $z_3(s_3) = \epsilon_1 \lambda_1 + \epsilon_2 \lambda_2 + \cdots + \epsilon_k \lambda_k,$

where $\epsilon_1 < \epsilon_2 < \cdots < \epsilon_k$ are slopes of the broken line segments
and $h_1, h_2, \ldots, h_k$ are the widths of the intervals between dotted lines in (21).
After the substitution the problem is again a standard linear programming
problem; hence we may repeat the procedure of stage IV for stage III, generating a
convex broken line function $z_2(s_2)$.

Where upper bound methods are available it will not be necessary to express
the conditions $0 \le \lambda_i \le h_i$ explicitly [11]. Under this approach no extra
equations are required and very little extra work. If an upper bound technique is
not available one can substitute instead

(24.0)  $s_3 = a_1 \lambda_1 + a_2 \lambda_2 + \cdots + a_{k+1} \lambda_{k+1}, \quad \lambda_i \ge 0,$

        $z_3(s_3) = c_1 \lambda_1 + c_2 \lambda_2 + \cdots + c_{k+1} \lambda_{k+1}$

and add the condition

(24.1)  $1 = \lambda_1 + \lambda_2 + \cdots + \lambda_{k+1},$

where $(a_i, c_i)$ are the values of $s_3$ and $z_3$ at the $i$th breakpoint.
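As a check on the two substitutions, the following sketch evaluates a convex broken line function three ways: by linear interpolation between recorded breakpoints, by the segment form (23), and by the breakpoint-weight form (24). The function names and the sample breakpoints are made up for illustration. The segment form is written here with s measured from the first breakpoint and z from its value there, which is an assumption about how the origin is handled. Filling segments in order of increasing slope is what a minimizing linear program would do when the function is convex.

```python
def interpolate(breaks, s):
    """Value at s by linear interpolation between breakpoints (a_i, c_i)."""
    for (a0, c0), (a1, c1) in zip(breaks, breaks[1:]):
        if a0 <= s <= a1:
            return c0 + (c1 - c0) * (s - a0) / (a1 - a0)
    raise ValueError("s lies outside the recorded range")


def segment_form(breaks, s):
    """Form (23): s = a_1 + sum(lam_i), 0 <= lam_i <= h_i, z = c_1 + sum(eps_i * lam_i)."""
    remaining = s - breaks[0][0]
    z = breaks[0][1]
    lam = []
    for (a0, c0), (a1, c1) in zip(breaks, breaks[1:]):
        h = a1 - a0                 # width of the interval
        eps = (c1 - c0) / h         # slope of the segment
        take = min(max(remaining, 0), h)
        lam.append(take)
        z += eps * take
        remaining -= take
    if remaining > 1e-12:
        raise ValueError("s lies outside the recorded range")
    return lam, z


def weight_form(breaks, s):
    """Form (24): nonnegative weights on breakpoints, summing to one."""
    lam = [0.0] * len(breaks)
    for i, ((a0, _), (a1, _)) in enumerate(zip(breaks, breaks[1:])):
        if a0 <= s <= a1:
            theta = (s - a0) / (a1 - a0)
            lam[i], lam[i + 1] = 1.0 - theta, theta
            break
    else:
        raise ValueError("s lies outside the recorded range")
    z = sum(weight * c for weight, (_, c) in zip(lam, breaks))
    return lam, z


if __name__ == "__main__":
    breaks = [(0, 0), (2, 2), (5, 8), (6, 12)]  # slopes 1 < 2 < 4, hence convex
    print(interpolate(breaks, 3), segment_form(breaks, 3), weight_form(breaks, 3))
```

All three evaluations agree at any s inside the recorded range, which is the property that lets either substitution stand in for the broken line function.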

Continuing in this manner, we can compute successively the functions $z_3(s_3)$,
$z_2(s_2)$, $z_1(s_1)$. Since the initial status is known we are now in a position to
solve for the optimal program for the first stage activities. The value of $s_1$ thus
determined may be used to determine the optimal program for second stage
activities, including the value of $s_2$, which in turn permits the solution for the
third stage, etc.

In theory, the functional equation approach could be extended to the case
where there are two components in the status vector, say $(s, t)$, which are the
shared variables between successive stages, by working out the convex set of possible
values of $s$, $t$ and the value of $z(s, t)$. The resulting convex surface in three
dimensions could be represented by its vertices (circled points in (25.0)). As an
alternative, the value of $z$ at a number of grid points (25.1) could be determined
and these points used instead of vertex points to approximate the surface. It is
clear, however, that when stages are tied together by more than one variable, the
functional equation approach becomes increasingly tedious to apply.

[Figures (25.0) and (25.1): possible values of $(s, t)$ in the plane, with vertex points
circled in (25.0) and grid points marked in (25.1)]

PART II—THE GENERAL CASE 
The Need to Solve Large Scale Systems 5 

At the present time it is possible to solve linear programming problems of the
order of two hundred equations and almost any number of variables with
reasonable accuracy and costs on electronic computers. Codes are available for even
larger systems, but the increased time for solution and the increased accuracy
requirements place a practical limit on the size of systems that can be solved by
general linear programming techniques.

For linear programming problems involving matrices which exhibit a special 
structure, it seems possible to develop special techniques that can extend the 
size of systems many times. 

Thus typical of the large scale problems encountered in practice are those
concerned with distributing a homogeneous product from several sources to multiple
destinations. For example, the optimal shipping program for a milk company
with twelve sources (milk shed areas) distributing canned milk to two thousand
warehouses requires the solution of a system of more than 2000 equations in
24,000 unknowns. Fortunately there is a highly developed theory for Koopmans’
Transportation-type Problem, of which this is an example, which makes it
possible to solve systems of this size [7, 14, 19, 24, 29].

5 The remarks in this section are similar to those found in [10].

Air transport problems [26] and communication problems [22] have structures
similar to but unfortunately more complicated than the classical transportation
problems. Because they are “multi-index” problems, even the simplest of such
systems, while very special in structure, can be enormous in size.

Consider, for example, the problem of routing cargo aircraft. Let the variable
$x_{ijk}$ represent the number of aircraft of type $k$ routed between city $i$ and $j$. Let
us distinguish between six types of aircraft and twenty cities. In addition,
consider a second set of variables $y_{ijl}$ which is the tons of cargo shipped between
city $i$ and $j$ on the way to $l$. Our equations become

Aircraft in = Aircraft out:     $\sum_j x_{jck} = \sum_i x_{cik}$                 $(k = 1, \ldots, 6)(c = 1, \ldots, 20)$

Cargo in = Cargo out:           $a_{cl} + \sum_j y_{jcl} = \sum_i y_{cil} + b_{cl}$   $(l = 1, \ldots, 20)(c = 1, \ldots, 20)$

Tonnage Cap. ≥ Tonnage Req.:    $\sum_k K_k x_{ijk} \ge \sum_l y_{ijl}$             $(i = 1, \ldots, 20)(j = 1, \ldots, 20)$

Plane Months Available:         $\sum_i \sum_j x_{ijk} = P_k$                       $(k = 1, \ldots, 6)$

where $a_{cl}$ is tons of cargo arising at $c$ for $l$, $b_{cl} = 0$ for $c \ne l$, and $b_{ll}$ is total
requirements at $l$. As we see again such a system involving only a few cities,
types of aircraft, and cargo destinations can easily generate a system of 1000
equations in 10,000 unknowns.
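The order of magnitude quoted above can be checked by counting. The sketch below tallies one equation per index combination in each of the four groups and one unknown per index triple; it assumes that pairs with i = j are counted along with the rest, so the totals are slightly generous. The function name is illustrative.

```python
def cargo_model_size(cities=20, aircraft_types=6):
    """Count equations and unknowns of the cargo routing model by index ranges."""
    equations = {
        "aircraft in = aircraft out": aircraft_types * cities,  # one per (k, c)
        "cargo in = cargo out": cities * cities,                # one per (l, c)
        "tonnage capacity": cities * cities,                    # one per (i, j)
        "plane months available": aircraft_types,               # one per k
    }
    unknowns = {
        "x_ijk": cities * cities * aircraft_types,
        "y_ijl": cities * cities * cities,
    }
    return sum(equations.values()), sum(unknowns.values())


if __name__ == "__main__":
    print(cargo_model_size())  # six aircraft types, twenty cities
```

With six aircraft types and twenty cities this gives 926 equations and 10,400 unknowns, in line with the round figures in the text.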
There have been some efforts to develop a theory for these generalized
transportation-type problems with only meager results. On the other hand, many
numerical examples have been solved by hand methods, suggesting that many
difficulties that could arise in theory are not common in practice.

As a second example, consider a hypothetical but typical problem encountered 
in programming an industrial complex—in this case in the expansion of “motor” 
production—let us say a special type motor that requires a special type of steel 
and must use tools fabricated from this steel and the tools which fabricate these 
tools also use this steel. The tools that fabricate steel we will call below steel 
capacity, those that fabricate tools—tool capacity, and those that fabricate 
motors—motor capacity. The initial inventories must satisfy the first 5 equations 
in detached coefficient form given in the tableau, while the outputs from the 
activities in the first time period must balance, in the next 5 equations, the cor¬ 
responding inputs for activities in the second time period, where d% is a given 
demand for motor stocks. 

If a planner is interested in developing a program over two years by quarters
that meets a specified schedule of known sales and creates the largest stockpile
of motors for any future sales that may develop, then the pattern of coefficients
in the tableau must be repeated for eight time periods. If we denote the upper
and lower blocks by $A$ and $B$, respectively, the model has the form

(28)    $\begin{bmatrix} A & & & \\ B & A & & \\ & \ddots & \ddots & \\ & & B & A \end{bmatrix}$

The resulting system of 40 equations in 80 variables, with the objective to
maximize a stockpile of motors, can be solved in less than a half hour on a modern
electronic computer. Let this planner now decide that his model is entirely too
coarse and that he must plan by months and distinguish two types of motors and
two types of steel; our resultant system becomes (7 × 24) × (14 × 24), or 168 × 336.
At this size the computation would now require a few hours. However, should
the planner again decide to refine the model, either with smaller time periods,
a finer breakdown of various commodities, or geographical location, he would
discover that general linear programming facilities were inadequate for his
problem. Yet techniques such as described in the next section have been applied to
the 40 × 80 system, resulting in an optimal solution in a few hours by hand.
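The growth in size follows directly from repeating one block pattern per period. The sketch below computes the overall dimensions and prints the block layout, assuming a staircase arrangement in which each period's columns carry the upper block A in that period's rows and the lower block B in the next period's rows. Both the arrangement and the function names are assumptions made for illustration.

```python
def staircase_shape(rows_per_period, cols_per_period, periods):
    """Overall (equations, variables) when one block pattern repeats each period."""
    return rows_per_period * periods, cols_per_period * periods


def staircase_pattern(periods):
    """Block layout: 'A' on the diagonal, 'B' just below it, '.' for empty blocks."""
    return [
        "".join("A" if col == row else "B" if col == row - 1 else "."
                for col in range(periods))
        for row in range(periods)
    ]


if __name__ == "__main__":
    print(staircase_shape(5, 10, 8))    # quarterly model over two years
    print(staircase_shape(7, 14, 24))   # monthly model with finer commodities
    print("\n".join(staircase_pattern(4)))
```

Five equations and ten variables per quarter give the 40 by 80 system; seven equations and fourteen variables per month give 168 by 336.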





Solving General Block Triangular Systems 

It has been observed that the vast majority of practical problems falls into
the block triangular class. The successful computation of the special case of the
Dynamic Leontief Substitution Model suggests that a similar approach might
be tried for the more general case. Consider a simple three-by-four
transportation problem. Its array of coefficients takes the form

(29)    [Array of coefficients of the transportation problem: blocks along the diagonal
        and a bottom strip, with right-hand sides $a_i$ and $b_j$]


where we have assumed “slack” in the “column” equations to bring out more
sharply the structure. 3 A similar structure can be observed for air transport
models [26] and communication models [22]. However, unlike the substitution
model, a basis, see (9), drawn from a structure such as (29) does not in general
have the property that the diagonal submatrices $B_{ii}$ are square and nonsingular.
For example, the basis associated with the variables xn , xu , x^z , x ^, x 3 4 , x& , 
Xaz takes the form 


(30)    [Display: the basis matrix, partitioned as in (29)]


3 The General Block Triangular Case consists of submatrices (blocks) on and below the 
diagonal as in (8). When these blocks are vacuous except along the diagonal and bottom 
strip as in (29) it is called Angular. 
Here the first and second partitions have an excess of columns over rows while
the last partition has a shortage. This situation is quite typical and greatly
complicates the computations. Research work has been concentrated on trying to
reduce the computation time of such systems in two main ways:

(a) decreasing the number of iterations; 

(b) finding a compact form for the inverse of the basis. 

Experience with larger systems of the order of 20 equations in 500 to 1000
unknowns indicates that they tend to go many hundreds of iterations (using the
simplex method) before an optimum is reached. This is most unfortunate as the 
number of operations required to use the inverse of a basis per iteration goes up 
roughly as the square of the number of equations and more decimal places have 
to be carried to maintain accuracy. 

To cut down the number of iterations, a number of variants of the simplex 
method have been proposed whose general purpose is to replace the usual phase 
I of the simplex method (which seeks a basic feasible solution) by a procedure 
that produces either optimal or near optimal basic feasible solutions. Of these I 
mention a few proposals: 

(a) Beale: Method of Leading Variables. A variant of the dual-simplex method
[25] that optimizes considering first one, then two, three, etc., constraint
equations [1, 13].

(b) Orchard-Hays: Composite Simplex Algorithm. Uses artificial variables but
instead of minimizing their sum as is usual in phase I, starts by minimizing the
objective form. Results in a feasible solution to the dual with some infeasibilities
remaining in the primal. The dual-simplex method is then applied [28].

(c) Dantzig, Ford, Fulkerson: Primal-Dual Algorithm. This is a generalization
of the Ford-Fulkerson proposal for transportation problems [19]. Using a
feasible solution to the dual (or a pseudo-solution to the dual), the infeasibility
of the primal problem is minimized over a restricted set of variables whose
corresponding dual variables are zero. The dual variables are then adjusted and the
process repeated until no infeasibility remains in the primal [15], at which point
the solution is optimal.

(d) Markowitz: Maximum decrease of objective form, $z$, per decrease of
infeasibility form, $w$. The proposal (unpublished) is to replace the usual criterion for
phase I, which introduces into the basic set a variable $x_j$ such that $dw/dx_j < 0$
is minimal; instead, $x_j$ is chosen such that $dw/dx_j < 0$ and the ratio
$(dz/dx_j)/(dw/dx_j)$ is maximal.
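The difference between the usual phase I criterion and proposal (d) can be shown on a few numbers. In the sketch below, `dz[j]` and `dw[j]` stand for the rates of change of the objective form and of the infeasibility form with respect to nonbasic variable j; these names, and the sample values, are hypothetical.

```python
def usual_phase1_choice(dw):
    """Usual criterion: the variable whose dw/dx_j is most negative."""
    j = min(range(len(dw)), key=lambda idx: dw[idx])
    return j if dw[j] < 0 else None


def ratio_choice(dz, dw):
    """Proposal (d): among variables with dw/dx_j < 0, maximize (dz/dx_j)/(dw/dx_j)."""
    candidates = [j for j in range(len(dw)) if dw[j] < 0]
    if not candidates:
        return None  # no variable reduces the infeasibility form
    return max(candidates, key=lambda j: dz[j] / dw[j])


if __name__ == "__main__":
    dz = [-4.0, 1.0, -1.0]
    dw = [-1.0, -3.0, -2.0]
    print(usual_phase1_choice(dw), ratio_choice(dz, dw))
```

On these values the usual rule picks the variable that cuts infeasibility fastest even though it raises z, while the ratio rule picks the one giving the largest decrease in z per unit decrease in w.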

The above proposals are applicable in general for any linear programming
problem; however, it is believed their use can at best cut down somewhat the number
of iterations and perhaps make the difference between success and failure in the
solution of a large multistage system. An intuitive suggestion peculiar to
multistage systems will now be discussed: Consider a system of type (28) where the
submatrices are repeated from period to period. The idea is to try to obtain
inductively an optimum solution for a $T = 1, 2, \ldots$ period model. To solve a
$T + 1$ period model that maximizes some output (motors, in the example) one
could first maximize a $T$ period model. Next, translate the entire solution forward
one time period so that the activity levels of period $t$ become those for $t + 1$. For
the starting period $t = 0$ start with a set of storage activities. The hope is that
the assumed future activities will generate a set of prices that are representative
of future actions and therefore a good guide for selection of activities for the first
period. It is also hoped that there will be very little in the way of substitution
of earlier type activities for later ones. To the extent that this is true, the
adjustments are like solving a one-period model. This approach was used
successfully on some tests with the motor-steel-tool model referred to earlier (27).

We now turn to proposals for finding a compact form for the inverse of the 
basis. 

(a) The first proposal, due to Markowitz [27], is particularly applicable
whenever the basis $B = [b_{ij}]$ is composed largely of zeros. Consider the linear system

(31)    $\sum_j b_{ij} x_j = y_i, \quad i = 1, 2, \ldots, m.$

Markowitz essentially mechanized a hand elimination procedure for solving such
a system for $x$ in terms of $y$: for the pivot element choose a column with as many
zeros as possible. The selection of the next variable to be eliminated can be made
the same way on the reduced system. [It is also possible to seek out rows with
many zero entries and carry out certain transpose operations on the matrix.] The
information recorded is (a) the operations performed and (b) the back solution.
This results in the inverse of the matrix being represented as a product of
elementary matrices (i.e., matrices that are the same as identity matrices except
for either one row or one column). These columns or rows, as a rule, have a large
per cent of zeros. This technique has worked out well in practice. The inverse
of the basis from iteration to iteration is maintained by multiplying it by
additional elementary matrices. After a number of iterations, however, the compact
representation is lost and it is necessary to reinvert the basis from “scratch” to
make it compact again.
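A minimal sketch of the product form may help fix the idea. Each elimination step is stored as one column (here called an eta column, a name not used in the text) of an elementary matrix, and the system is solved by applying the stored columns in order. For brevity the pivots are taken down the diagonal in natural order and are assumed nonzero; the zero-counting pivot selection described above is not implemented.

```python
def apply_eta(r, eta, v):
    """Multiply v by the elementary matrix equal to the identity except column r = eta."""
    vr = v[r]
    return [eta[i] * vr if i == r else v[i] + eta[i] * vr for i in range(len(v))]


def factor(B):
    """Gauss-Jordan elimination on B, recording one eta column per pivot."""
    m = len(B)
    cols = [[B[i][j] for i in range(m)] for j in range(m)]  # column-wise copy of B
    etas = []
    for r in range(m):
        pivot = cols[r][r]
        if pivot == 0:
            raise ValueError("zero pivot; this sketch does not reorder rows or columns")
        eta = [1.0 / pivot if i == r else -cols[r][i] / pivot for i in range(m)]
        cols = [apply_eta(r, eta, col) for col in cols]     # reduce the remaining system
        etas.append((r, eta))
    return etas


def solve(etas, y):
    """Solve B x = y by applying the recorded elementary matrices in order."""
    x = list(y)
    for r, eta in etas:
        x = apply_eta(r, eta, x)
    return x


if __name__ == "__main__":
    B = [[2.0, 0.0, 0.0],
         [1.0, 1.0, 0.0],
         [0.0, 3.0, 4.0]]
    etas = factor(B)
    print(solve(etas, [2.0, 3.0, 18.0]))
```

When B is mostly zeros the recorded columns are mostly zeros as well, so storing only their nonzero entries keeps the representation compact.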

(b) The second proposal, due to the author, is designed for block triangular
structures. It consists in taking a basis such as (30) that does not have square
submatrices down the diagonal, and modifying a number of the columns. For
example, suppose the second and fourth columns of (30) are replaced by unit
vectors with unity in the third from last and in the last component, respectively.
Then upon rearrangement of the columns, the original basis $B$ has been replaced
by a pseudo basis $\bar{B}$ as shown in (32).

(32)    $\bar{B}$ = [Display: the pseudo basis]

It is now possible to represent $B^{-1}$ as the product of $\bar{B}^{-1}$ and a matrix which is
an identity matrix except for two columns (in this case). The inverse of $\bar{B}$ is
never developed explicitly; instead, only the inverses of submatrices down the
diagonal are recorded and modified from one iteration to the next. This work is
described in [11]. Alan Manne, W. Orchard-Hays, Ted Robacker, and the author
have made extensive studies of other ways to transform the basis $B$ into the
form $\bar{B}$ for the Angular case. The above proposal is representative. It should be
remarked that this approach is efficient only if the number of excess columns is
relatively small. It is conjectured that this will be so for most block-triangular
systems encountered in practice and that it is worth while to build up a computing
procedure along these lines.

(c) Recently the author, jointly with Philip Wolfe, developed a new procedure
that is particularly applicable to angular systems and multistage systems of the
staircase type (1). This is reported in preliminary form in RAND P-1544
(November 10, 1958) under the title, “A Decomposition Principle for Linear
Programs.” The system consists of certain goods shared in common among several
parts and certain goods (including facilities, raw materials) peculiar to each part.
In short the system is angular in structure.

Although the entire procedure is one intended to be carried out internally in
an electronic computer it may also be viewed as a decentralized decision making
process. Each independent part initially offers a possible bill of goods (a vector
of the common outputs and supporting inputs including outside costs) to a central
coordinating agency. As a set these are mutually feasible with each other and
the given common resources and demands from outside the system. The
coordinator works out a system of “prices” for paying for each component of the
vector plus a special subsidy for each part that just balances the cost.

The management of each part then offers, based on these prices, a new feasible 
program for his part with lower cost without regard to whether it is feasible for the 
system as a whole. The coordinator, however, combines these new offers with the 
set of earlier offers so as to preserve mutual feasibility and consistency with 
exogenous demand and supply and to minimize cost. Using the improved over-all
solution he generates a revised set of prices, subsidies, and receives new offers. 
The essential idea is that old offers are never forgotten by the central agency 
(unless using “current” prices they are unprofitable); the former are mixed with 
the new offers to form new prices. 
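For orientation, the coordinator's task at any one cycle can be written as a small linear program over the offers received so far. The notation here is assumed for illustration and does not come from the text: $K$ parts, offers $x_k^j$ from part $k$, cost rows $c_k$, common-goods coefficients $A_k$, common right-hand side $b_0$, weights $\lambda_{kj}$ on the offers, prices $\pi$, and subsidies $\sigma_k$. The sketch also assumes that each part's own feasible set is bounded.

```latex
\begin{aligned}
\text{minimize}\quad   & \sum_{k=1}^{K} \sum_{j} \bigl(c_k x_k^{j}\bigr)\,\lambda_{kj} \\
\text{subject to}\quad & \sum_{k=1}^{K} \sum_{j} \bigl(A_k x_k^{j}\bigr)\,\lambda_{kj} = b_0
                         && (\text{prices } \pi) \\
                       & \sum_{j} \lambda_{kj} = 1, \qquad k = 1, \dots, K
                         && (\text{subsidies } \sigma_k) \\
                       & \lambda_{kj} \ge 0 .
\end{aligned}
```

Under these assumptions, part $k$ prepares its next offer by minimizing $(c_k - \pi A_k)\,x_k$ over its own constraints, and the offer is worth adding when that minimum falls below $\sigma_k$. Offers whose weight stays at zero are the ones that are unprofitable at current prices.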
MANAGEMENT MODELS AND INDUSTRIAL
APPLICATIONS OF LINEAR 
PROGRAMMING* 

A. CHARNES and W. W. COOPER 

Purdue University and Carnegie Institute of Technology 


Little progress in "activity analysis models,” at least as far as industrial 
applications are concerned, appears to have been made since Koopmans' 
original research [50d]. Perhaps this lack of progress has resulted from the 
rather extreme example used by Koopmans, a completely delegated, almost 
egalitarian model of an organization. Perhaps the orientation toward the 
general problems of an economic system, in the classical tradition, or
Koopmans’ lack of detailed attention to expedient computation devices account for
the fact that inadequate attention has been devoted to the possible value of
further developments for industrial applications. This situation should be
remedied by devoting attention to the adaptations, modifications and
extensions required to make these models suitable for industrial applications.

It is true that Koopmans' formulation needs to be interpreted if it is to be 
brought within the framework of the more usual forms of linear programming. 
It is not one, but a series of linear programming problems. (See appendix.) 
The crux of Koopmans' formulation rests on the concept of efficiency prices. 
These prices, or their “accounting” counterparts, are intended as internal guides
for a decentralized organization, analogous to, say, the so-called internal
profit-and-loss control systems employed by many large commercial
organizations. 1 The objective is to supply price guides, including prices of fixed facilities,
which can be used for bidding by the various departments both for services 
supplied within the firm itself and from outside sources. 

In order to see what is involved consider the system of inequalities in Table I. 
Each column of the Table represents an "activity” and each row a "commodity. ” 
The variables x indicate the levels at which the activities are to be run and the 
Brenstock, Aaron Yuzow and Solomon Schwartz, Management in Russian Industry and
Agriculture (London: Oxford University Press, 1944) describe a system used by the Soviet 
Government. 
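TABLE I

Activity Analysis Model

(Activities $x_1$, $x_2$, $x_3$, $x_4$. Coefficients are listed in row order; the activity
column to which each coefficient belongs is not recoverable.)

Coefficients        Net Outputs and Inputs      Stipulations
1                   $y_1$                       $\ge 0$
-1                  $y_2$                       $\ge 0$
-1, -1, 1           $y_3$                       $= 0$
-1, 1               $y_4$                       $= 0$
-3, -2              $y_5$                       $\ge -12$
-5                  $y_6$                       $\ge -10$
-1, -2              $y_7$                       (illegible)
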

values $y$ represent the corresponding amounts of each commodity. The
commodities (goods or services) are divided into final, $y_1$ and $y_2$; intermediate, $y_3$
and $y_4$; and primary products, $y_5$, $y_6$, and $y_7$. These divisions correspond to the
relations the variables, $y$, bear to the stipulations and to sign conventions used
to distinguish between inputs and outputs. Final products ($y_1$ and $y_2$) are
constrained to be non-negative, primary ones ($y_5$, $y_6$, and $y_7$) are constrained to
be non-positive and to conform to stipulated limits while intermediate products
($y_3$ and $y_4$) are zero. Within any column (in the body of the table) a negative
sign attached to any coefficient designates an input to the activity and a positive
sign an output.

Koopmans’ organization model may be summarized as follows: Each
commodity (row) is placed in charge of a custodian and each activity (column) in
charge of a manager. 2 Custodians and managers are each to maximize their
own profits. The issue is whether it is possible for a central office committee
(a helmsman in Koopmans’ terminology) to devise a system of prices, or price
rules, which will guarantee certain results (not necessarily optimal) to the overall
entity. As has already been indicated, only limited guarantees can be offered
unless further intervention is allowed. Under certain circumstances efficiency
can be achieved. Moreover, as Koopmans shows, by following specified rules of
pricing both inside and “outside” transactions may be comprehended by
these efficiency conditions.

2 This terminology is borrowed from Koopmans [50d], p. 93.

It is important to emphasize both the differences and similarities that exist
between Koopmans’ and the linear programming approaches that have
previously been presented. In one interpretation an efficient program is only one
that is not obviously wasteful. Thus a point $y$ with coordinates $y_i$, $i = 1, 2, \ldots$,
is said to be efficient if and only if there does not exist a point $\bar{y}$, with
coordinates $\bar{y}_i$, $i = 1, 2, \ldots$, which is better. The term “better” is used in the sense
of a partial ordering: No coordinate of $\bar{y}$ is less than the corresponding coordinate
of $y$, and at least one coordinate is greater. Formally, $\bar{y}$ is better than $y$ if

(1)     $\bar{y} \ge y$,

where “$\ge$” means $\bar{y} \geqq y$ and $\bar{y} \ne y$. If such a point $\bar{y}$ is available, then $y$ is
not efficient.
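The partial ordering can be stated in a few lines of code. The sketch below tests whether one point is better than another in the sense just defined and then filters a finite list of points down to those that no other listed point betters. Restricting attention to a finite list is a simplification made here for illustration; in the model the comparison runs over all attainable points. The function names are hypothetical.

```python
def is_better(y_bar, y):
    """True if no coordinate of y_bar is less than that of y and at least one is greater."""
    pairs = list(zip(y_bar, y))
    return all(a >= b for a, b in pairs) and any(a > b for a, b in pairs)


def efficient_points(points):
    """Points of a finite list that are not bettered by any other point in the list."""
    return [y for y in points if not any(is_better(other, y) for other in points)]


if __name__ == "__main__":
    candidates = [(1, 2), (2, 2), (2, 1), (0, 3)]
    print(efficient_points(candidates))
```

Here (1, 2) and (2, 1) are bettered by (2, 2), while (2, 2) and (0, 3) are not comparable with each other, so both survive.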

There are, to be sure, many problems which need to be considered before 
introducing this concept into an industrial organization. It does, however, have 
two virtues. First, it focuses attention on the fact (not always recognized in 
currently employed internal pricing systems) that (a) the maximizing objectives 
of any particular supervisor 3 and of the overall entity may conflict and (b) the 
objectives of the various supervisors may also fail to coincide. In short, the 
improvement secured by one may worsen the position of others. Second, there 
are regions in which the improvement secured by any supervisor redounds 
not only to his own benefit but also to the benefit of the entity and, perhaps, 
other supervisors as well. Thus, if y is not efficient then it is possible for at 
least one custodian or manager (under properly conceived price rules) to improve 
his own position without worsening the lot of any other custodian or manager. 
The prices designed to produce the “correct” behavior under these
circumstances are the efficiency prices. 4 Although dynamics of these price
arrangements have not been fully worked out, it is possible to ensure, under certain
circumstances, that custodians and managers dealing with each other will not 
be foregoing benefits which they might otherwise obtain by dealing with sources 
outside the entity. 

Koopmans has made one start on problems which are important in cost
allocations as well as in organization theory. In dealing with the question of multiple
objectives it was necessary for him to alter features of the usual linear
programming model. The usual objective, scalar optimization of a single quantity
(e.g. total profits or costs) is replaced by a problem in vector optimization. As 
will be shown in the appendix the activity analysis approach can be reconciled 
with linear programming. It replaces one linear programming problem by a 
series of such problems and their duals. 

It is conceivable that incorporation of hierarchical 5 and hierarchoid
arrangements into the models of activity analysis may provide a start toward adjusting
them for industrial applications. But the activity analysis approach is
important in its own right. The relations between linear programming and zero-sum
two-person games are well known. It is possible that the activity analysis 
approach may provide a similar bridge for other types of games as well. The 
so-called Pareto-Nash 6 equilibrium points in non-cooperative game solutions 

3 The term supervisor is here used to refer to both custodians and managers. 

4 A more general formulation would specify other kinds of information, or
“misinformation,” to be supplied as a means of correcting potential misbehavior by supervisors. This
kind of extension is being studied by the authors in collaboration with Martin Shubik of
the General Electric Co.

5 Vide [60] for a simplified example and further references. 

6 Cf. [16], [64] and [68]. 
suggest an affinity and common origin with Koopmans’ approach. Extended
versions of delegation models may also offer a means of dealing with some of the
difficult problems of sub-optimization 7 that are often faced in applied work.
In particular, it should be possible to evolve methods for imputing prices
(initially or finally) to omitted elements of the system. These possibilities alone
would seem to warrant the further research required for industrial applications.

Appendix 

Table I of the text was used to illustrate some of the constructs of activity 
analysis. A major purpose of this approach to programming concerns the analy¬ 
sis of rules which might be employed to guide the activities of an entity under 
a decentralized management regime. The objectives of each official in such an 
entity may assume a variety of forms relative to the objectives of other officials, 
and the entity itself. They may conflict or complement one another, or they 
may be entirely neutral. (The economic model of a free price economy illus¬ 
trates the possibilities.) The purpose of the rules (e.g., efficiency pricing) in 
activity analysis is to ensure that certain levels of attainment (e.g., efficient 
points) will be secured when each official is allowed to pursue his own objectives. 

How can multiple objectives, such as these, be restated in order to make this
problem amenable to the methods of linear programming? The purpose of this
appendix is to reformulate the models of activity analysis so that this can be done
and then to develop computational procedures for locating all efficient points and
efficiency prices. 8

Recall that the matrix A of coefficients 9 in any activity analysis model may
be partitioned into three major sectors, viz.,

$A_F$, for final commodities

$A_I$, for intermediate commodities

$A_P$, for primary commodities.

A vector $y' = (y_F', y_I', y_P')$ of “commodities” can then be similarly partitioned
and associated with each such array of coefficients, by a vector x, of activity
levels, defined so that

$$Ax = y \tag{3}$$

$$x \geq 0, \quad y_F \geq 0, \quad y_I = 0, \quad -y_P \leq -\eta_P, \quad \eta_P \leq 0.$$

Among the possible vectors y are subsets called efficient points. These points
are distinguished by the property that it is not possible to improve any component
of $y_F$ (within the limits allowed by the restriction) without worsening at
least one (and possibly more) of the others.

7 Vide [38].

8 Also called “shadow prices” and “accounting prices”. See Koopmans [50d], p. 65.

9 E.g., Table I in the text. Such matrices are called “technology matrices.” Vide loc.
cit., p. 37.





The following necessary and sufficient conditions of efficiency are established
by Koopmans: 10 A vector y is efficient if and only if there exists a vector p 11
with

$$p'y = 0$$

$$p'A \leq 0 \tag{4}$$

$$p_F > 0, \quad p_{P=} \geq 0, \quad p_{P>} = 0,$$

where $p_F$ and $p_P$ are prices of final and primary commodities, respectively.
$p_{P=}$ indicates those primary commodities (e.g., factors of production) which are
utilized to capacity and $p_{P>}$ those which are not.

In order to construct a class of special linear programming problems for locating
such y's and p's, it is useful to introduce the concept of "antecedents” of
efficient points and prices. These antecedents can be interpreted as activity
vectors with special properties. They are here characterized as optimal solutions
to dual linear programming problems so constructed that they can be associated
with solutions to the problem in vector optimization stated by Koopmans.
Means which are available for determining all optimum extreme point solutions 12
to any linear programming problem can then be used to determine all efficient
points. An easy extension provides solutions to the corresponding duals. Thus
all efficient points and the corresponding efficiency prices can be readily ascertained
by linear programming techniques.

The sets of all antecedents of efficient points are unions of convex polyhedral 
sets. In general, such unions do not form convex sets. Also, since the linear image 
of a convex set is convex, the sets of all efficient points are unions of polyhedral 
convex sets and the same holds for the efficiency prices. 

The linear programming problem to be considered may now be stated in matrix
form as

$$\max\ v^{0\prime} A_F x$$

subject to

$$-A_P x \leq -\eta_P$$

$$-A_I x = 0 \tag{5}$$

$$x \geq 0$$

$$v^0 > 0$$

with its dual

$$\min\ w_P'(-\eta_P)$$


10 Theorem 5.4.1, p. 82, loc. cit.

11 This will be called a price vector, following Koopmans, although (as he shows) it may
also be related to the concept of marginal rates of substitution. Cf. loc. cit., pp. 66 ff.

12 E.g., by the use of “Tarry data” for the labyrinth problem. See [9]. There is no loss 
of generality in confining attention to extreme point optima since all others can be secured 
from them. See [16]. 
subject to

$$v^{0\prime} A_F \leq -w_P' A_P - w_I' A_I \tag{6}$$

or

$$v^{0\prime} A_F + w_I' A_I + w_P' A_P \leq 0,$$

with $w_P' \geq 0$ and $w_I'$ unrestricted.

Theorem: If

$$(y_F', y_I', y_P') = (x^{*\prime} A_F', x^{*\prime} A_I', x^{*\prime} A_P') \tag{7}$$

and $x^*$ is an optimum solution to (5) then y is necessarily an efficient point, and

$$p' = (p_F', p_I', p_P') = (v^{0\prime}, w_I^{*\prime}, w_P^{*\prime}) \tag{8}$$

the corresponding efficiency prices where, of course, $w_I^*$ and $w_P^*$ are parts of
an optimal solution to the dual, (6).

Proof: It is necessary only to show that these optimum solutions conform to
the conditions (4). Constructive procedures for locating all such optima will
then be supplied. These solutions provide all efficient points and prices. The
condition $p'A \leq 0$ is, of course, equivalent to (6) since

$$v^{0\prime} A_F + w_I' A_I + w_P' A_P = p'A.$$

Also

$$p'y = v^{0\prime} A_F x^* + w_I^{*\prime} A_I x^* + w_P^{*\prime} A_P x^* = v^{0\prime} A_F x^* + w_P^{*\prime} \eta_P \tag{9}$$

since $A_I x^* = 0$. Moreover, by the theorem of the alternative 13 $(w_P^*)_r = 0$
whenever $-(A_P x^*)_r < (-\eta_P)_r$. Hence $(A_P x^*)_k = (\eta_P)_k$, $k \neq r$, so that

$$w_P^{*\prime} A_P x^* = w_P^{*\prime} \eta_P.$$

It therefore follows that

$$p'y = v^{0\prime} A_F x^* + w_P^{*\prime} \eta_P = w_P^{*\prime}(-\eta_P) + w_P^{*\prime} \eta_P = 0 \tag{10}$$

since, by the dual theorem, $v^{0\prime} A_F x^* = w_P^{*\prime}(-\eta_P)$.

The first two conditions in (4) are thus established. The remaining properties,
$p_F > 0$, $p_{P=} \geq 0$, and $p_{P>} = 0$, on the price vector, are also obtained.
The condition on $p_F$ is true by the assumptions on $v^0$. The properties of $w_P^*$
used in establishing (9) and (10) are precisely those exhibited in (4), viz.,
$(w_P^*)_r = 0$ whenever $(A_P x^*)_r > (\eta_P)_r$ and $(w_P^*)_k \geq 0$ for $k \neq r$.

The proof is therefore complete. Any y which has x* as its antecedent is
efficient and $p' = (v^{0\prime}, w_I^{*\prime}, w_P^{*\prime})$ is the corresponding vector of efficiency
prices.

To determine all efficient points and prices it is sufficient to program parametrically 14
over the set of prices


13 See [80]. 

14 Cf. [35]. 
$$\sum_{r=1}^{n} (v^0)_r = 1 \tag{11}$$

with

$$(v^0)_r \geq \epsilon > 0,$$

for $\epsilon$ arbitrarily small. The x* and w* obtained by tracing out the labyrinthine
path over all such extreme point optima provide the required efficiency prices
and efficient points. 15 The procedure is as follows: Start with an optimal solution
to $(v^0)_r = \epsilon$. Next, parametrically vary $v^0$, obtaining new optimal tableaus
until such tableaus have been obtained for all 16

$$v^0 \in \Big\{ v^0 \;\Big|\; \sum_{r=1}^{n} (v^0)_r = 1,\ (v^0)_r \geq \epsilon > 0 \Big\}. \tag{12}$$

Then for each such optimal tableau develop the alternate basic optima (e.g.,
by the labyrinth traversal method) 17 noting the dual basic optima as well.
When this procedure is completed the efficiency prices will be available along
with the antecedents (extreme points) of the efficient points.

Reverting to Table I of the text for an illustration, let $v^{0\prime} = (1, a)$, $a > 0$,
then the linear programming model to be used for this example of activity analysis
is

$$\max\ v^{0\prime} A_F x = x_1 + a x_2 + 0x_3 + 0x_4$$

15 The perturbation procedure provided in [9] resolves all ambiguity with respect to degeneracy
or alternate optima that may be encountered in these “wanderings.”

16 In general a tableau will remain optimal for a complete convex subset of the $v^0$. For,
consider a basic solution x* associated with expression

$$P_j = \sum_{i \in I} P_i y_{ij}.$$

Since the solution is optimal,

$$z_j - c_j = \sum_{i \in I} c_i y_{ij} - c_j \geq 0.$$

For present purposes $c_j > 0$. Fixing the $y_{ij}$ and allowing the $c_k$'s to be variable, then the
inequalities

$$\sum_{i \in I} y_{ij} c_i - c_j \geq 0$$

define the intersection of n halfspaces. This intersection is non-empty and convex. Its further
intersection with

$$\sum_{s} c_s = 1, \quad c_s \geq \epsilon > 0$$

is a bounded, closed convex set and, thus, a polyhedral convex set.

It follows that the set

$$\Big\{ v^0 \;\Big|\; \sum_{r=1}^{n} (v^0)_r = 1,\ (v^0)_r \geq \epsilon > 0 \Big\}$$

will be swept out in a finite number of parameter variations (each sufficient to induce a change
from the previous basic solution) because:

(1) Every optimal set corresponding to a $v^0$ vector contains at least one basic solution.

(2) There are only a finite number of basic solutions.

17 See [9]. 
subject to

$$-A_I x = 0: \quad \begin{aligned} -x_1 - x_2 + x_3 &= 0 \\ -x_3 + x_4 &= 0 \end{aligned} \tag{13}$$

$$-A_P x \leq -\eta_P: \quad \begin{aligned} 3x_1 + 2x_2 &\leq 12 \\ 5x_1 &\leq 10 \\ x_1 + 2x_2 &\leq 9 \end{aligned}$$

with $x \geq 0$. For simplex solutions, or variants thereof, 18 this is all that is really
needed. The solutions $x \geq 0$ provide the extreme point antecedents of the corresponding
efficient y’s. Solutions to the dual are read from the $z_j - c_j$ row
immediately under the slack vectors or their artificial counterparts. 19 The values
$w_I^*$, $w_P^*$ and $v^{0\prime}$ are the corresponding efficiency prices. By varying $v^{0\prime}$ parametrically,
in the parameter a, all such solutions may be obtained.
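The parametric sweep in a can be imitated on this small example without a simplex code. The following is only a sketch, not the tableau procedure of the text: it assumes the primary constraints of (13) read 3x1 + 2x2 <= 12, 5x1 <= 10 and x1 + 2x2 <= 9, drops x3 and x4 (which carry zero weight in the objective and are fixed by the intermediate equations), and finds the optimum by brute-force enumeration of the vertices of the feasible region. The function names are illustrative.

```python
from fractions import Fraction as F
from itertools import combinations

# Rows (c1, c2, b) encode c1*x1 + c2*x2 <= b.
# Assumed reading of the primary constraints in (13); the last two rows are x1, x2 >= 0.
ROWS = [
    (F(3), F(2), F(12)),
    (F(5), F(0), F(10)),
    (F(1), F(2), F(9)),
    (F(-1), F(0), F(0)),
    (F(0), F(-1), F(0)),
]


def vertices():
    """Intersect every pair of boundary lines and keep the feasible points."""
    points = set()
    for (a1, a2, b1), (c1, c2, b2) in combinations(ROWS, 2):
        det = a1 * c2 - a2 * c1
        if det == 0:
            continue  # parallel boundaries never meet
        x1 = (b1 * c2 - a2 * b2) / det  # Cramer's rule
        x2 = (a1 * b2 - b1 * c1) / det
        if all(r1 * x1 + r2 * x2 <= b for r1, r2, b in ROWS):
            points.add((x1, x2))
    return points


def best_vertex(a):
    """Vertex maximizing x1 + a*x2 (a should avoid the breakpoints, where ties occur)."""
    return max(vertices(), key=lambda p: p[0] + a * p[1])


for a in (F(1, 2), F(1), F(3)):
    print(a, best_vertex(a))
```

Enumeration is practical only because the example has two free activities; for larger models the parametric simplex procedure described in the text is the appropriate tool.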

Inserting artificial and slack vectors, as required in (13), the arrangement
shown in Table II-A is secured. 20 For $0 < a < \tfrac{2}{3}$ the tableau of II-B is a
unique optimum with the activity levels $x_1 = 2$, $x_2 = 3$, $x_3 = 5$, $x_4 = 5$
appearing under $P_0$ and the unit value appearing opposite $S_5$ (in the stub)
indicating that one unit of slack is programmed on the receiving facility. These
are the antecedents of the efficient point y with final commodities $y_1 = 2$,
$y_2 = 3$ and primary commodities $y_5 = -12$, $y_6 = -10$ and $y_7 = -8$. 21
The corresponding efficiency prices may be obtained from $v^{0\prime}$ and from the
values for $z_j - c_j$ shown under $I_1$, $P_4$, $S_3$, $S_4$ and $S_5$ in Tableau II-B.
Therefore, for any "prices” 22 $1$, $a$ with $0 < a < \tfrac{2}{3}$, established for the final products
the corresponding prices on the intermediate ones are zero 23 while those for
the primary commodities are $a/2$, $\tfrac{1}{5} - \tfrac{3a}{10}$ and 0, respectively.

Tableau II-B remains uniquely optimal until $a = \tfrac{2}{3}$. At this point an
alternate optimum is apparent with $S_4$ in place of $S_5$. The resulting substitution
(indicated by the arrows in II-B) yields the alternate shown in II-C.
For $\tfrac{2}{3} < a < 2$ the latter is uniquely optimal with activity levels $x_1 = \tfrac{3}{2}$,
$x_2 = \tfrac{15}{4}$, $x_3 = x_4 = \tfrac{21}{4}$ and $\tfrac{5}{2}$ units of slack on $M_2$. The corresponding program is,


18 See [12].

19 See [16].


20 The artificial vector, $I_1$, is a unit vector with non-zero component in the first row. The
slack vectors are $S_3$, $S_4$ and $S_5$, where the subscript indicates the row in which the
non-zero component appears in these unit vectors.

21 Signs may be altered, if desired, to denote input-output relations as defined by
Koopmans [50d]. The intermediate commodities $y_3$ and $y_4$ need not be written down since
they must always equal zero.

22 “Prices”, “net profits”, or other measures of relative desirability. If 1, a are profits,
any prices which yield these net results may be used. When “outside trading” is allowed,
prices must be established with these possibilities in mind. Vide Koopmans,
loc. cit., pp. 91 ff.


23 I.e., the values for $z_j - c_j$ shown under $I_1$ and $P_4$ in Table II-B. For this purpose the
penalty rate, M, associated with the artificial vector, $I_1$, is ignored.
TABLE II A-D
Efficient Point Calculations
(B: Optimum Tableau for 0 < a < 2/3; C: Optimum Tableau for 2/3 < a < 2; D: Optimum Tableau for a > 2)


final commodities: $y_1 = \tfrac{3}{2}$, $y_2 = \tfrac{15}{4}$; intermediate ones zero and primary ones
$y_5 = -12$, $y_6 = -7\tfrac{1}{2}$ and $y_7 = -9$. This new program is also efficient with
final "prices” $1$, $a$ with $\tfrac{2}{3} < a < 2$, intermediate ones at zero and primary prices of
$\tfrac{1}{2} - a/4$, 0, and $-\tfrac{1}{2} + \tfrac{3}{4}a$. 24

When a = 2 an alternate optimum is again made available with $S_3$ in place
of $P_1$, as shown in Tableau II-D. This Tableau completes the possibilities
since it remains uniquely optimal for all a > 2. For all such cases $x_1 = 0$,
$x_2 = x_3 = x_4 = 4\tfrac{1}{2}$. Thus, $y_1 = 0$, $y_2 = 4\tfrac{1}{2}$, $y_3 = 0$, $y_4 = 0$, $y_5 = -9$,
$y_6 = 0$, $y_7 = -9$. Hence the efficiency prices for final commodities are 1,
a on $y_1$ and $y_2$, respectively; all intermediate and primary products, with the
exception of $y_7$, receive imputed values of zero and $y_7$ a price of a/2 for all
a > 2.
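As a cross-check on the three regimes, conditions (4) can be verified numerically. The sketch below is an illustration under stated assumptions: the technology matrix A is a reading of Table I inferred from the constraints (13), and the activity levels and prices for each range of a are those quoted in the text (with the middle primary price in the first range taken as 1/5 - 3a/10, which is what the dual equations give). It confirms that p'y = 0 and p'A <= 0 in every regime.

```python
from fractions import Fraction as F

# Assumed technology matrix: rows y1, y2 (final), y3, y4 (intermediate), y5..y7 (primary).
A = [
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [1, 1, -1, 0],
    [0, 0, 1, -1],
    [-3, -2, 0, 0],
    [-5, 0, 0, 0],
    [-1, -2, 0, 0],
]


def regime(a):
    """Activity levels x and price vector p for a given a (away from a = 2/3 and a = 2)."""
    if a < F(2, 3):
        x = [F(2), F(3), F(5), F(5)]
        w = [a / 2, F(1, 5) - 3 * a / 10, F(0)]
    elif a < 2:
        x = [F(3, 2), F(15, 4), F(21, 4), F(21, 4)]
        w = [F(1, 2) - a / 4, F(0), 3 * a / 4 - F(1, 2)]
    else:
        x = [F(0), F(9, 2), F(9, 2), F(9, 2)]
        w = [F(0), F(0), a / 2]
    p = [F(1), a, F(0), F(0)] + w  # final, intermediate, primary prices
    return x, p


def satisfies_conditions_4(a):
    """True when p'y = 0 and every component of p'A is non-positive."""
    x, p = regime(a)
    y = [sum(coef * xj for coef, xj in zip(row, x)) for row in A]
    p_dot_y = sum(pi * yi for pi, yi in zip(p, y))
    p_times_a = [sum(p[i] * A[i][j] for i in range(7)) for j in range(4)]
    return p_dot_y == 0 and all(v <= 0 for v in p_times_a)


print([satisfies_conditions_4(F(n, 10)) for n in (1, 5, 10, 15, 30)])
```

Exact fractions are used so that the equality p'y = 0 can be tested without rounding tolerance.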

24 At a = 2/3, or a = 2, trouble is caused by the presence of alternate optima so that the
price information may not be sufficient for guidance.


This completes the exposition. Further understanding of what is involved and
some useful additional information can be secured from the above tableaus. As
has already been noted, these efficiency prices can be related to marginal productivities
and the marginal rates of substitution of economic theory. They can
therefore also be brought to bear in indicating the levels at which the relevant
substitutions will be made and thus used to establish sensitivity limits.
APPLICATIONS OF LINEAR PROGRAMMING IN THE

OIL INDUSTRY * 1 

W. W. GARVIN 2 , H. W. CRANDALL 3 , J. B. JOHN 4 , and R. A. SPELLMANN 4 

This paper is the result of a survey made during the summer of 1956. It is a
progress report on applications of linear programming by a number of oil companies.
Examples are presented of applications to a variety of problems arising
in the areas of Drilling and Production, Manufacturing, and Marketing and
Distribution. The examples were selected to illustrate both the power and the
limitations of present linear programming methods when applied to actual
problems.


1. Introduction 

Plans were made during early 1956 for a symposium on industrial application 
of linear programming to be presented at the Fall Meeting of the Institute. As 
the theme of that meeting was “A Progress Report” and some of the earliest 
applications of linear programming were made in the oil industry, it seemed 
fitting to include in the program a progress report on what the oil industry had 
been able to accomplish thus far in this field. 

We were requested by George Dantzig to present such a review and to include 
in it not only some of our applications but, if possible, applications by other oil 
companies as well. With this in mind, about a dozen major oil companies were 
contacted by us and were invited to contribute linear programming applications 
or studies they had made which were of general interest and of nonconfidential 
nature. The response was most encouraging. Because of limited time available 
arrangements were made to visit personnel of six oil companies for the purpose 
of discussing their work and ours in the linear programming field. 

The oil industry became aware of linear programming through the pioneering
work of Charnes, Cooper, and Mellon (1952, 1954) and the work of Gifford Symonds
(1953). We owe a great deal to these gentlemen and to Alan Manne (1956)
for pointing out to us that linear programming has a place in our business. A
few years ago, there were few people indeed in the oil industry who had ever
heard of such things as “basic solution” or “convex set”. Today, these terms
are much more familiar and as a result much less frightening to some. What is
involved here is an educational process and educational processes are notoriously
slow. It is amazing, therefore, to see how much has been done in such a comparatively
short time.

* Received February, 1957. 

1 A version of this paper was presented at the Los Angeles Meeting of the Institute of 
Management Sciences, October 18, 1956. 

2 California Research Corporation, La Habra, California. 

3 Standard Oil Company of California, San Francisco, California. 

4 California Research Corporation, Richmond, California. 
[illegible] problems become more interwoven and [illegible] of the oil industry
are no exception. They can logically be divided into four different phases of
business, as shown in Figure 1. An integrated oil company must first of all carry
out exploration activities to determine the spots where oil is most likely to be
found. [illegible] with us, we hit oil. Additional wells are drilled to develop
the field and production gets underway. The oil is transported by various
means to the refinery where a large variety of products are manufactured from it.
The products in turn leave the refinery, enter the distribution system and are
marketed.

Needless to say, each of the areas shown in Figure 1 is full of unanswered questions.
[illegible] should they be combined for maximum effectiveness? An oil field
can be produced in many different ways. Which is best? The complexity of a
modern refinery is staggering. What is the best operating plan? And what
do we mean by “best”? Of course, not all the problems in these areas lend themselves
to linear programming but some of them do. What we would like
to do is to select some typical LP-type problems from each area, show
how they were formulated and, in some cases, discuss the results that were
obtained.

We had hoped to find applications in all four of the areas shown in Figure 1.
Unfortunately, we were successful only in three. We did not find any nonconfidential
applications in the field of exploration. [illegible] one of the most
confidential phases of our business and it is for that reason that oil companies are
not very explicit about their studies in this field. We can state, however, from
[illegible] that a number of applications to exploration are under investigation.
Let us therefore turn our attention to the remaining three areas of Drilling
and Production, Manufacturing, and Distribution and Marketing. Figure 2
shows an outline of the applications that will be discussed. Out of the Drilling
and Production area the problem of devising a model for a producing complex
was selected. In the case of Manufacturing, the selection was difficult because historically
this was the first area of application and much work has been done in
this field. The problems shown were selected because they either illustrate an
important concept or because they illustrate a peculiar twist in mathematical
formulation. The problem of incremental product costs illustrates the technique
of parametric programming and also shows what can happen if too many simplifications
are introduced. The methods developed for handling tetra-ethyl lead
and variable cut points illustrate how, under certain conditions, nonlinearities
can be introduced into the system. The problem of cost coefficients will illustrate
the need for realistic refinery costs. Finally, three problems out of the area of
Distribution and Marketing were selected: a bulk plant distribution problem
having to do with the shipment of products from refineries to bulk plants in an
expanding market and the problem of devising long-range and short-range delivery
schedules from bulk plants to service stations.

2. Model of a Producing Complex 

Let us now turn our attention to the first problem on the list, a model of a
producing complex. We are indebted to the Field Research Laboratory of Magnolia
Petroleum Company and to Arabian American Oil Company for contributing
this application. This problem will be discussed in more detail in a forthcoming
publication by A. S. Lee and J. S. Aronofsky of Magnolia. Consider N
oil fields or reservoirs (i = 1, 2, ..., N) as shown in Figure 3 which are producing
at rates $Q_i(t)$ where t is the time. The total production of the N reservoirs is to be
adjusted to meet a commitment $Q_c(t)$ (such as keeping a pipe line full or a
refinery supplied). An outside source of crude oil is also available. Let the profit
realizable per barrel be $c_i(t)$ and consider that the operation is to be run on this
basis for a period of T years. Production limitations exist which require that the
$Q_i(t)$ do not exceed certain values and that the pressures in the reservoirs do not
fall below certain values. These limits may be functions of the time. We shall
consider the case where these fields are relatively young so that development
drilling activity will occur during the time period under consideration. The
problem is to determine a schedule of $Q_i(t)$ such that the profit over T years is a
maximum.

By splitting up the period T into time intervals (k = 1, 2, ..., K) and bringing
in the physics of the problem, it can be shown that the condition that the field
pressures are not to fall below certain minimum values assumes the form:

$$\sum_{j=1}^{k} (f_{i,k-j+1} - f_{i,k-j})\, Q_{ij} \leq p_{i0} - p_{i\,\min} \tag{1}$$

for all i and k. The $f_{i,j}$ describe the characteristics of the fields and are known.
The righthand side is the difference between the initial and the minimum permissible
pressure of the i’th field. The variable is $Q_{ij}$ which is the production rate of
the i’th field during the j’th period. Additional constraints on the $Q_{ij}$’s are that
the total production for any time period plus the crude oil possibly purchased
from the outside source, $Q_j$, be equal to the commitment for that time period:

$$\sum_{i=1}^{N} Q_{ij} + Q_j = Q_{cj}, \quad j = 1, 2, \ldots, K \tag{2}$$

Furthermore, production limitations exist such that:

$$Q_{ij} \leq Q_{ij\,\max} \tag{3}$$

which are simple upper bound constraints. The objective function expressing
profit over the time period considered is:



$$\sum_{j=1}^{K} \Big( \sum_{i=1}^{N} c_{ij} Q_{ij} + c_j Q_j \Big) = \max \tag{4}$$

which completes the formulation of the linear programming problem. The coefficients
$c_{ij}$ and $c_j$ are the profit per barrel of the i’th reservoir at time j and correspondingly
for purchased crude oil.

Fig. 3 (N reservoirs and an outside source supplying the commitment $Q_c(t)$)
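The structure of this model can be made concrete with a small feasibility check. The sketch below is illustrative only: it tests a candidate schedule against constraints (1) to (3) and evaluates a profit that is linear in the field rates and the purchased volumes. The data are invented, the function names are hypothetical, and the response coefficients are assumed to satisfy f[i][0] = 0 so that the differences in (1) are defined for every lag.

```python
def pressure_drop(f_i, q_i, k):
    """Left-hand side of constraint (1) for one field at period k (1-based)."""
    return sum((f_i[k - j + 1] - f_i[k - j]) * q_i[j - 1] for j in range(1, k + 1))


def is_feasible(f, q, q_max, p_init, p_min, bought, commitment):
    """Check a schedule q[i][j] and purchases bought[j] against (1), (2) and (3)."""
    n, periods = len(q), len(commitment)
    for i in range(n):
        for k in range(1, periods + 1):
            if pressure_drop(f[i], q[i], k) > p_init[i] - p_min[i]:
                return False  # pressure constraint (1) violated
        if any(not 0 <= q[i][j] <= q_max[i][j] for j in range(periods)):
            return False  # upper bound (3) violated
    return all(  # commitment (2) must be met exactly in every period
        sum(q[i][j] for i in range(n)) + bought[j] == commitment[j]
        for j in range(periods)
    )


def profit(c, c_out, q, bought):
    """Linear profit: field production plus purchased crude, summed over periods."""
    own = sum(c[i][j] * q[i][j] for i in range(len(q)) for j in range(len(bought)))
    return own + sum(cj * bj for cj, bj in zip(c_out, bought))


# Hypothetical data: two fields, two periods.
F_COEF = [[0, 2, 3], [0, 1, 2]]   # f[i][0..K], with f[i][0] = 0 assumed
Q = [[10, 5], [4, 8]]             # production rates Q_ij
Q_MAX = [[10, 10], [10, 10]]
P_INIT, P_MIN = [30, 20], [5, 8]
BOUGHT, COMMIT = [1, 2], [15, 15]
C, C_OUT = [[3, 3], [2, 2]], [1, 1]

print(is_feasible(F_COEF, Q, Q_MAX, P_INIT, P_MIN, BOUGHT, COMMIT))
print(profit(C, C_OUT, Q, BOUGHT))
```

A checker of this kind only evaluates a given schedule; finding the best schedule is the job of the linear programming code.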

Thus far, everything has been rather straightforward. But now, the time has
come to clutter up the theory with facts. Let us take a closer look at the coefficients
$c_{ij}$. If we plot revenue vs a particular production rate $Q_{ij}$, we get a
straight line passing through the origin as shown in Figure 4. Cost vs $Q_{ij}$ is
also more or less a straight line which, however, does not pass through the
origin. The cost function is discontinuous at the origin, corresponding to a set-up
charge such as building a road, a pipe line or harbor facilities or installing a gas-oil
separator. It drops to zero when $Q_{ij} = 0$ because this corresponds to not yet
developing the field. Also shown on Figure 4 is profit vs $Q_{ij}$ which is the difference
between revenue and cost. The profit function thus is the straight line
shown plus the origin. Hence, we can say that profit from $Q_{ij}$ production is
$c_{ij}Q_{ij} - s_{ij}$ where $s_{ij}$ is zero if $Q_{ij}$ is zero and $s_{ij}$ is a constant if $Q_{ij} > 0$. This
is a particularly difficult constraint. No general methods are available for handling
this except a cut-and-try approach. This type of fixed set-up charge constraint
occurs in many practical problems and we shall meet it again later on.
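A cut-and-try treatment of the set-up charge can be sketched for a single period. The example below is a toy under stated assumptions, not a method from the survey: each field is a hypothetical tuple (profit per barrel, set-up charge, maximum rate), the outside source is assumed unlimited, and for every choice of which fields to open the commitment is filled greedily from the most profitable opened field downward. All numbers are invented.

```python
from itertools import combinations


def plan_profit(open_fields, demand, c_out):
    """Profit for one period when exactly the fields in open_fields are developed."""
    remaining, total = demand, 0
    for c, s, q_max in sorted(open_fields, reverse=True):  # highest margin first
        q = min(q_max, remaining) if c > c_out else 0
        total += c * q - s  # the set-up charge is paid for every opened field
        remaining -= q
    return total + c_out * remaining  # the outside source covers the rest


def cut_and_try(fields, demand, c_out):
    """Try every subset of fields to open and return (best profit, chosen subset)."""
    subsets = (sub for r in range(len(fields) + 1) for sub in combinations(fields, r))
    return max((plan_profit(sub, demand, c_out), sub) for sub in subsets)


# Hypothetical fields: (profit per barrel, set-up charge, maximum rate).
FIELDS = [(3, 20, 10), (2, 5, 10)]
print(cut_and_try(FIELDS, 12, 1))
```

With these numbers the field with the higher profit per barrel is left undeveloped because of its set-up charge, which a purely linear model ignoring the charge would not reveal. The number of subsets doubles with each added field, so this enumeration is workable only for small cases.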

One other complicating feature should be mentioned. Consider that during a
certain time period, $Q_{ij}$ was at level “A” as shown in Figure 4 and that in the
succeeding time period $Q_{i,j+1}$ has dropped to level “B”. The profit at level “B”
is not obtained by following the profit line to operating level “B” but rather by 
following a line as shown which is parallel to the revenue line. The reduction 
in level from “A” to “B” involves merely turning a few valves and essentially 
does not entail any reduction in operating costs. If, on the other hand, we go 
from “A” to “C” in succeeding time periods, then we do follow the profit line 
because an increase in production necessitates drilling additional wells assuming 
that all the wells at “A” are producing at maximum economic capacity. If we 
should go from “A” to “B” to “C” in succeeding time periods and if “A” was 
the maximum field development up to that time, then in going from “B” to 
“C” we would follow the broken path as shown in Figure 4. 

This state of affairs can be handled by building the concept of “production
capacity” into the model and requiring that production capacity never decreases
with time. But this can be done only at the expense of enlarging the system appreciably.



Fig. 4 
There exist other factors and additional constraints which must be taken into
account. As is so often the case, we are dealing here with a system which on the
surface looks rather simple but which becomes considerably more complex
as we get deeper into it to make it more realistic. Nevertheless, the simple
system or modest extensions of it enable an entire producing complex to be
studied, thus providing a good basis upon which to build more realistic models.

3. Incremental Product Costs 

Let us now leave the problems of petroleum production behind us and venture 
into the petroleum refinery. As was indicated before, a great deal of work has been 
done in this area. The few problems we shall discuss will be illustrative of what is 
going on in this field. 

We shall consider at first a simple but nevertheless instructive example. We
are indebted to Atlantic Refining Company for contributing this application
(Birkhahn, Ramser and Wrigley, 1956). A refinery produces gasoline, furnace
oil and other products as shown in Figure 5. The refinery can be supplied with a
fairly large number of crude oils. The available crude oils have different properties
and yield different volumes of finished products. Some of these crudes must
be refined because of long-term minimum volume commitments or because of
requirements for specialty products. These crudes are considered fixed and yield
gasoline and furnace oil volumes $V_G$ and $V_F$ respectively. From the remaining
crudes and from those crudes which are available in volumes greater than their
minimum volume commitment must be selected those which can supply the
required products most economically. These are the incremental crudes. Denote
the gasoline and furnace oil volumes which result from the incremental crudes
by $\Delta V_G$ and $\Delta V_F$ and the total volumes (fixed plus incremental) by $V_{GT}$ and $V_{FT}$.
The problem is to determine the minimum incremental cost of furnace oil as a
function of incremental furnace oil production keeping gasoline production and
general refinery operations fixed.

The formulation of this problem is straightforward:

$$\sum_{i=1}^{N} a_{Gi} V_i = V_{GT} - V_G = \Delta V_G \tag{5}$$

$$\sum_{i=1}^{N} a_{Fi} V_i = V_{FT} - V_F = \Delta V_F \tag{6}$$

$$V_i \leq V_{i\,\max} \tag{7}$$

$$\sum_{i=1}^{N} c_i V_i = \min \tag{8}$$

where $a_{Gi}$ and $a_{Fi}$ are the gasoline and furnace oil yields of the i’th crude, $V_i$
and $V_{i\,\max}$ are the volume and availability of the i’th incremental crude and $c_i$
is the cost of producing incremental gasoline plus incremental furnace oil per
barrel of the i’th crude. This cost is made up of the cost of crude at the refinery,
the incremental processing costs and a credit for the by-products produced at
the same time.

The procedure now consists of assuming a value for $\Delta V_F$ and obtaining an optimal
solution. The shadow price of equation (6) will then be equal to the incremental
cost of furnace oil because it represents the change in the functional
corresponding to a change of one barrel in $\Delta V_F$. The incremental cost thus obtained,
however, is valid only over ranges of variation of $\Delta V_F$ which are sufficiently
small so that the optimum solution remains feasible. Beyond that permissible
range of $\Delta V_F$ the basis must be changed with a resulting change in the
shadow price. For problems of this type, the so-called “parametric programming”
procedure can be used. This procedure has been incorporated into the IBM 704
LP code. It starts with an optimal solution and then varies in an arbitrary but
preassigned manner the constants on the right-hand side until one of the basic
variables becomes zero. The computer then prints out the optimal solution
which exists at that time, changes the basis to an adjacent extreme point which
is also optimum and repeats this process until a termination is reached.
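The staircase behaviour of the shadow price can be reproduced on a toy version of (5) to (8). The sketch below is not the parametric procedure of the IBM 704 code: it simply re-solves the problem for each furnace oil requirement by enumerating basic solutions (two crudes basic, the others at zero or at their availability), which is workable only for a handful of crudes. The yields, costs and availabilities are invented for illustration.

```python
from fractions import Fraction as F
from itertools import combinations, product


def min_cost(a_g, a_f, v_max, cost, d_gas, d_fur):
    """Minimum of sum(c_i V_i) subject to (5), (6), 0 <= V_i <= V_i,max; None if infeasible."""
    n, best = len(cost), None
    for i, k in combinations(range(n), 2):  # the two basic crudes
        rest = [m for m in range(n) if m not in (i, k)]
        det = a_g[i] * a_f[k] - a_g[k] * a_f[i]
        if det == 0:
            continue
        for levels in product(*[(F(0), v_max[m]) for m in rest]):
            v = [F(0)] * n
            for m, level in zip(rest, levels):
                v[m] = level  # non-basic crudes sit at a bound
            r_g = d_gas - sum(a_g[m] * v[m] for m in rest)
            r_f = d_fur - sum(a_f[m] * v[m] for m in rest)
            v[i] = (r_g * a_f[k] - a_g[k] * r_f) / det  # Cramer's rule
            v[k] = (a_g[i] * r_f - r_g * a_f[i]) / det
            if all(0 <= v[m] <= v_max[m] for m in (i, k)):
                total = sum(c * x for c, x in zip(cost, v))
                if best is None or total < best[0]:
                    best = (total, v)
    return best


# Hypothetical data for three incremental crudes.
A_G = [F(1, 2), F(1, 4), F(1, 4)]   # gasoline yields
A_F = [F(1, 4), F(1, 2), F(1, 4)]   # furnace oil yields
V_MAX = [F(40)] * 3
COST = [F(2), F(3), F(1)]
GAS = F(15)                          # fixed incremental gasoline requirement

for d in (10, 11, 12, 13, 14):
    print(d, min_cost(A_G, A_F, V_MAX, COST, GAS, F(d))[0])
```

For these invented data the total cost is flat up to a requirement of 12.5 and then rises with a constant slope, so the incremental cost jumps at the point where the basis changes, as described above.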

An actual problem was run with the model shown on Figure 5. Thirteen
incremental crudes were available and incremental gasoline production was
fixed at 14,600 barrels daily. The results are shown in Figure 6 which shows the
minimum total incremental cost as a function of incremental furnace oil production.
Ignore the dashed line for the moment. The circles represent points at
which the optimum basis had to be changed. The functional is a straight line between
these points. It turned out that incremental furnace-oil production was
possible only in the range from about 7100 bpd to about 11200 bpd. Between the
two extremes, the functional exhibits a minimum at about 8000 bpd. The reason
for the minimum is to be found in the fact that near the two extremes of furnace
oil production, little choice exists in the composition of the crude slate. Volume
is the limitation and economics plays a secondary part. Away from the two extremes,
however, we have greater flexibility in crudes run and thus have the
freedom to pick the cheapest crude combination. Figure 7 shows the incremental
cost of furnace oil as a function of furnace oil production. It is a staircase type
function because the shadow price remains unchanged as long as the optimum
basis remains feasible and jumps discontinuously whenever the basis is changed.
At low levels of incremental furnace oil production, the incremental cost becomes
negative because in that region it is more expensive to make less furnace oil.

If we now were to show our model and our results to the refiner, he would immediately
detect a fly in the ointment. The negative incremental cost at low furnace
oil production runs counter to his intuitive feeling for the problem. He
would point out, and rightly so, that the formulation of our model is not complete.
Common sense would dictate the making of the larger volumes of furnace
[...] example, this excess can be mixed into heavy fuel production. If all the heavy
fuel that is made can be sold, the net cost of the furnace oil over-production would
be the negative of the value of heavy fuel indicating a credit we receive for increasing
heavy fuel production.

We are tempted, therefore, to try the formulation shown in Figure 8 where we
permit the diversion of some furnace oil to heavy fuel. The equation for gasoline
production remains unchanged but the furnace oil equation now reads:

$$\sum_{i=1}^{N} a_{Fi} V_i - S_1 = \Delta V_F \tag{9}$$

and the objective form is:

$$\sum_{i=1}^{N} c_i V_i - v_{HF} S_1 = \min \tag{10}$$

where $S_1$ is a slack variable indicating the volume of furnace oil diverted to heavy
fuel and $v_{HF}$ is the value per barrel of heavy fuel. It is not possible, however, to
divert unlimited amounts of furnace oil into heavy fuel without violating heavy
fuel’s specifications. The upper limit on how much furnace oil can be mixed into
heavy fuel depends on the volume of heavy fuel produced which in turn is related
to the crude slate, and would depend also on the specifications of heavy fuel.
Furthermore, if we bring heavy fuel into the picture explicitly, the cost coefficients
used before must be modified. The problem is beginning to become more
complex. To take these effects into account would form the basis of an entirely
new study. For purposes of the present illustration, however, the situation can be
handled roughly as follows. It turns out from experience and by considering the
volumes involved that the excess furnace oil production should be less than or at
most equal to about 15 per cent of the incremental furnace oil production if all
the excess is to go to heavy fuel and specifications on heavy fuel are to be met.
Therefore, the additional constraint

$$\sum_{i=1}^{N} a_{Fi} V_i + S_2 = 1.15\, \Delta V_F \tag{11}$$

was added to the system where $S_2$ is a slack variable. This constraint insures that
no undue advantage is taken of the freedom introduced by excess furnace oil
production.

The results for this second formulation of the problem are shown by the dashed
lines in Figures 6 and 7. The abscissa now refers to that part of incremental furnace
oil production which leaves the refinery as furnace oil. Excess furnace oil
is produced below incremental furnace oil production of about 8600 bpd. Above
that level, it is not economic to produce more furnace oil than required and consequently,
there is no difference between the two formulations of the problem.
Constraint (11) is limiting for incremental furnace oil production below about
7500 bpd. Figure 9 shows the composition of the optimum crude slate for the
second formulation as a function of incremental furnace oil production. This
is useful information to have on hand. Note that no changes occur in the range
of incremental furnace oil production from 7500 to 8600 bpd. In this range, actual
incremental furnace oil production remains fixed at 8600 bpd with any excess
going into heavy fuel.

The modern refinery is a complicated system with strong interdependence
among the activities within it. The example just described illustrates this point
and shows the importance of the refiner’s experience in correctly isolating
portions of the refinery which can be separately considered.

4. Nonlinear Effect of Tetra-ethyl Lead 

The next two applications are concerned with partially nonlinear systems. 
One of the most common types of nonlinearity encountered in refinery operations 
is connected with the effect of tetra-ethyl lead (TEL). TEL is added to gasoline 
to increase the gasoline’s octane number. The increase in octane number, however,
is not a linear function of the TEL concentration. The first cc of TEL has
a pronounced effect on octane number; the second cc has a smaller
effect, and for the third cc the effect is smaller still. The maximum
concentration permitted in motor gasoline is 3 cc per gallon.

A great deal of work has been done in the past few years on gasoline blending
by linear programming. The problem is to blend the different stocks coming out 
of the refinery into gasolines having specification properties and to do it at mini¬ 
mum cost. In addition to octane number, other properties such as vapor pressure 
and various distillation points must be considered. All the important properties 
blend linearly on a volume basis except for the effect of TEL. To get around the 
TEL difficulty, it was usually assumed in setting up the linear programming 
model that the gasoline was shipped out at maximum TEL level of 3 cc per gallon 
or the TEL level was arbitrarily set at some lower value. In any event, TEL did 
not enter the system as a variable and thus was not permitted to seek its own 
level as determined by minimum cost. To get a feeling for the order of magnitude
of money involved here, consider an average TEL concentration of 2 cc per gallon.
At a price of TEL of about $2 per liter, a TEL bill of about $180,000 results
for each million barrels of gasoline produced. Many companies produce on the
order of tens of millions of barrels per year. Thus, even a reduction of only a few
per cent in lead concentration begins to look big when translated into money
savings.
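
The order of magnitude quoted here can be checked with a few lines of arithmetic. The sketch below is only an illustration: it assumes a 42-gallon barrel and takes the round figures from the text (2 cc per gallon, $2 per liter); the function name `tel_bill` is hypothetical.

```python
GALLONS_PER_BARREL = 42   # assumed standard petroleum barrel
CC_PER_LITER = 1000


def tel_bill(barrels, cc_per_gallon, dollars_per_liter):
    """Return the TEL cost in dollars for a given volume of gasoline."""
    liters_of_tel = barrels * GALLONS_PER_BARREL * cc_per_gallon / CC_PER_LITER
    return liters_of_tel * dollars_per_liter


bill = tel_bill(1_000_000, 2.0, 2.0)
print(f"TEL bill per million barrels: ${bill:,.0f}")

# A saving of a few per cent on ten million barrels per year:
print(f"3 per cent of a ten-million-barrel bill: ${0.03 * 10 * bill:,.0f}")
```

With exactly $2 per liter the bill comes to $168,000 per million barrels, the same order as the figure of about $180,000 given in the text, which would correspond to a TEL price slightly above $2 per liter.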

Consider now the general blending problem shown in Figure 10. The streams 
coming out of the refinery are split three ways: to Premium grade gasoline, to
Regular grade gasoline or to temporary storage. Additional stocks may be pur¬ 
chased from outside sources to go into gasoline. TEL is one such stock. The 
gasoline blends must satisfy a variety of quality specifications such as vapor 
pressure, distillation points and octane number. 

In setting up this problem in linear programming language, we have first of all 
the usual types of linear constraints which relate the properties of the stocks 
and the fraction of their volumes to the desired properties of the blended gaso¬ 
line. There is no difficulty here until we get to the octane condition. The relation 
we have is that: 


$$\frac{\sum_i ON_i^{c} V_i}{\sum_i V_i} + \Delta ON \ge ON^{*} \qquad (12)$$

where $ON_i^{c}$ is the “clear” octane number of the i’th stock (its octane number
with no TEL in the stock), $V_i$ is its volume, $ON^{*}$ is the specified minimum octane
number and $\Delta ON$ is the octane increase due to lead. The first term on the left
represents the “clear” octane number of the blend under the assumption of linear 
blending. Actually, clear octane numbers do not always blend linearly, but by 
using so-called “blending” octane numbers instead of actual ones, a sufficiently 
close linear approximation can be obtained. 

Let us now take a closer look at the $\Delta ON$ term. If, for a specified octane number
of the blend, we plot the difference between the clear octane number of the blend 
and the specified octane number as a function of TEL concentration required to 
bring the blend up to specification, we obtain a family of curves as shown in 
Figure 11 where the parameter is a characteristic called “lead susceptibility”. 
It is a measure of the ability of the blend to respond to TEL. Lead susceptibility 
can be considered to blend linearly with respect to volume. The curves are con¬ 
cave because of the saturation effects previously mentioned. 

Fig. 11

From past experience, it is usually possible to estimate within reasonable 
limits what the lead susceptibility of the blend is going to be. We can then con¬ 
struct curves as shown in Figure 11 for the estimated lead susceptibility and for 
susceptibilities deviating from that value by, say, ±10 per cent. Data are avail¬ 
able to do this. Let us now imagine that we split up the curves into, say, five bands 
as shown such that the curves within each band can be approximated by parallel 
and equidistant straight lines. This will always be possible by considering a
sufficiently large number of bands and a sufficiently small range in lead sus¬ 
ceptibility. The situation shown in Figure 11 was considered sufficiently accurate 
by us for our purposes. The bounding lines between bands are not required to 
be parallel. The bands can be interpreted in the following manner. Instead of
having only one type of TEL, we have, in this case, five fictitious types, TEL 1
through TEL 5. Each band corresponds to one type of lead. These fictitious leads
have two important properties: they do not saturate, i.e., their effect on octane
number is linearly related to the amount of each lead present, and the effect of
lead susceptibility on $\Delta ON$ is independent of the TEL concentration. They are
not all equally effective, however, as far as increasing the octane number is
concerned. In view of the concavity of the function, lead 1 is much more effective
than lead 5. We can thus write:


$$\Delta ON = a + b\,\frac{\sum_i S_i V_i}{\sum_i V_i} + \frac{\sum_j m_j L_j}{\sum_i V_i} \qquad (13)$$

where $S_i$ is the lead susceptibility of the i’th stock, $L_j$ is the amount of lead of
type j present in the blend and a, b, and $m_j$ are constants determined from the
curves. Also, we know that:

$$m_{j+1} < m_j . \qquad (14)$$


To insure that the fairy tale of the fictitious leads corresponds to reality, we must
impose availability restrictions on the $L_j$’s, for otherwise we would satisfy the
octane restrictions with $L_1$ because octane-wise it is cheapest and, as a result,
get way off the curve. As the straight lines within each band are equidistant, the
maximum amount of each lead that can be put into the blend can be represented
as a linear function of susceptibility, corresponding to the bounding straight lines
between the bands. Hence, we can write:


$$\frac{L_j}{\sum_i V_i} \le d_j + e_j\,\frac{\sum_i S_i V_i}{\sum_i V_i} \qquad (15)$$

where $d_j$ and $e_j$ again are constants determined from the curves. Substituting
the expression for $\Delta ON$ into equation (12) and multiplying through by $\sum_i V_i$, we
obtain a system of linear relations which can be incorporated in the over-all
linear programming formulation.
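
Carrying out the substitution explicitly may help. Assuming that (12), (13), and (15) have the forms described in the text (a volume-weighted clear octane plus $\Delta ON$ not less than $ON^{*}$, $\Delta ON$ linear in blend susceptibility and in the fictitious lead concentrations $L_j / \sum_i V_i$, and an upper bound on each $L_j$ linear in susceptibility), multiplying through by the total blend volume $\sum_i V_i$ gives relations that are linear in the decision variables $V_i$ and $L_j$:

```latex
\sum_i \left( ON_i^{c} + a + b\,S_i - ON^{*} \right) V_i \;+\; \sum_j m_j L_j \;\ge\; 0 ,
\qquad
L_j \;\le\; \sum_i \left( d_j + e_j S_i \right) V_i , \quad j = 1, \dots, 5 .
```

The coefficient of each $V_i$ is a known number once the stock properties and the curve constants are fixed, which is what allows these rows to be added directly to the blending matrix.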

We are not yet quite through, however. Each grade of gasoline has two octane
requirements which are called the F-1 and F-2 octane specifications. As we have
two gasoline grades, this means that we have four octane specifications that must
be met. Therefore, we have in reality four families of curves similar in shape to
those shown in Figure 11, and all four must be represented by the procedure just
discussed and added to the system. We must also distinguish not only among
different TEL types but also between TEL going into Premium or Regular to
meet the F-1 or F-2 octane requirement. If five fictitious leads are used for each
gasoline grade and each octane, then we have a total of 20 fictitious leads which,
as far as the matrix is concerned, are separate activities. From a physical point
of view, we must impose two additional constraints because the fictitious leads
are not completely independent. The total amount of TEL used in Premium
to meet or exceed F-1 must be the same as the total amount of TEL used in
Premium to meet or exceed F-2 because these two leads are physically identical.
They were separated in the matrix for mathematical reasons only. The same
type of constraint applies to Regular. Hence, we must stipulate:


$$\sum_j L_{j,\,\text{Premium F-1}} = \sum_j L_{j,\,\text{Premium F-2}} \qquad (16)$$

$$\sum_j L_{j,\,\text{Regular F-1}} = \sum_j L_{j,\,\text{Regular F-2}} \qquad (17)$$

Finally, the objective will be of the form:

$$c_L \sum_j \left( L_{j,\,\text{Premium}} + L_{j,\,\text{Regular}} \right) + \cdots = \min \qquad (18)$$

where $c_L$ is the unit cost of TEL and the dots indicate other terms whatever they
may be.

It is clear that the optimum solution will make physical sense only if the
fictitious leads for the limiting octanes are involved in the solution in a physically
realizable way. Consider the situation where, let us say, leads 1 and 2 are at their
upper bound, while leads 3 and 4 deviate from their upper bound and lead 5 is zero.
This is not a physically realizable situation because of the gap existing between
leads 3 and 4. This, however, could never occur in an optimal solution because of
the concavity of the TEL response curve and because we are aiming to use as
little TEL as possible. If a gap exists between $L_j$ and $L_{j+1}$, it will always be more
economic to reduce the level of $L_{j+1}$ and push $L_j$ up to its upper bound because
$L_j$ is more effective octane-wise than $L_{j+1}$. Therefore, we have the assurance that
the fictitious leads for the limiting octanes always will be involved in the optimal
solution in a physically realizable way. This will not necessarily happen, however,
for those leads which belong to the nonlimiting octane specifications. As we have
two octane specifications for each grade, there will in general be one octane in
each grade which is limiting while there is give-away on the other. The
computer will have no incentive to meet or exceed in a physically realizable way
the octane rating for which there is give-away. It cannot make any money by it because
the total amount of TEL already is fixed by the octane rating which is limiting as
required by constraints (16) and (17). The computer simply picks that octane
which is limiting, works on it to meet it most economically and lets the chips fall
where they may, as far as the other octane rating is concerned. The optimal
solution will still be perfectly satisfactory because the exact value of the give-away
for the nonlimiting octane does not affect the solution.
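
The no-gap argument can be illustrated numerically. The following is a minimal sketch, not the linear programming code used in the study: it treats a single limiting octane in isolation, with hypothetical slopes (octane gain per cc per gallon of each fictitious lead) and hypothetical fixed upper bounds, whereas in the text the bounds depend on blend susceptibility. Because the slopes decrease from lead 1 to lead 5, the least total TEL is obtained by filling the leads in order.

```python
def fill_fictitious_leads(required_boost, slopes, caps):
    """Meet an octane boost with the least total TEL.

    slopes[j] is the octane gain per unit of fictitious lead j (strictly
    decreasing in j); caps[j] is its upper bound.  Filling the leads in
    order is optimal for a concave response, so no gap can appear.
    """
    if any(later >= earlier for earlier, later in zip(slopes, slopes[1:])):
        raise ValueError("slopes must be strictly decreasing")
    levels = []
    remaining = required_boost
    for slope, cap in zip(slopes, caps):
        use = min(cap, max(remaining, 0.0) / slope)
        levels.append(use)
        remaining -= use * slope
    if remaining > 1e-9:
        raise ValueError("boost not attainable within the lead limits")
    return levels


# Hypothetical data: five bands of 0.5 cc/gal each.
slopes = [4.0, 2.0, 1.0, 0.5, 0.25]
caps = [0.5] * 5
levels = fill_fictitious_leads(3.25, slopes, caps)
print(levels, sum(levels))

# A solution with a gap (lead 4 used while lead 3 is empty) gives the same
# octane boost but needs more TEL, so a cost-minimizing program rejects it.
gapped = [0.5, 0.5, 0.0, 0.5, 0.0]
print(sum(s * l for s, l in zip(slopes, gapped)), sum(gapped))
```

With these numbers the ordered fill uses 1.25 cc per gallon, while the gapped alternative needs 1.5 cc per gallon for the same boost.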

Let us now briefly discuss the results for a case where this approach to the TEL 
problem was tried. The data were based on an actual situation that existed in one 
of our refineries a few years ago. In this case, gasoline production was fixed at a 
given level. The objective was to minimize cost of TEL minus credit for excess 
stocks. Two solutions of the same problem were available to which the linear 
programming solution could be compared. One was the solution that was actually 
used which was calculated in the refinery at the time when the problem arose. 
This solution, as is often the case, was prepared under severe time limitations. 
The availability of new blend stocks added to the difficulty of the problem. The 
other solution to the identical situation was obtained later by allowing sufficient 
time for a thorough analysis of the problem. In both these solutions, conven¬ 
tional hand blending procedures were used. Table I gives a comparison of TEL 
levels between these two solutions and the one obtained by linear programming.
The reduction in TEL is clearly evident. The solutions should, of course, not be 
compared merely on the basis of lead savings. As can be seen from the objective 
function, the credit for excess stocks must also be considered. The net savings of 
the linear programming solution still were substantial. 

5. Nonlinear Effects of Variable Cut-Points 

In the blending problem just discussed, the volumes and properties of the 
stocks coming out of the refinery were given and the problem was to blend these 
stocks to make certain end-products in the most economical way. The refinery 
as a whole was fixed and the optimum blending solution gave us little or no in¬ 
formation about what the optimum refinery operation should be. This, of course, 
is a tremendous problem because the refinery abounds with nonlinearities and all 
types of mathematically peculiar constraints. One interesting step toward the 
over-all refinery optimization was discussed recently by Schrage (1956) where 
linear programming was combined with the method of steepest ascents. 

The next application we would like to discuss is an attempt to reach back into 
the refinery just a little way and optimize with respect to gasoline blending a few 
of the operating conditions. The conditions we shall consider are the re-run still 
cut points. A re-run still is a unit within the refinery which separates a stock into 
light and heavy components. The operating temperature of the unit determines 
the “cut point” between the two components. The volumes and the properties 
of the “cuts” are nonlinear functions of the cut point. The cut point can be 
varied within limits and the question arises as to where the optimum cut point 
should be for any given gasoline blending situation. 

One way of handling this problem is to introduce fictitious stocks as shown
symbolically in Figure 12. We assume that instead of having only one cut point
we have, say, three cut points corresponding to temperatures $T_1$, $T_2$, and $T_3$
which yield small fictitious cuts “B” and “C” and major segregations “A” and
“D”. The volumes and properties of the fictitious cuts are determined such that
when they are combined linearly with the major segregations, correct volumes
and properties result. Consequently, the fictitious cuts sometimes have abnormal
properties when considered by themselves.

TABLE I

TEL Content in cc/gal. of Blends

The major segregations and the fictitious cuts are now made available to the 
gasoline blend just as if they were actual stocks coming out of the refinery. The
resulting optimal solution then is examined to see what happened to the fictitious 
cuts “B” and “C” in the shuffle. A number of things can occur as shown in Figure 
13. 

Because of the natural variation in properties with distillation temperature of
the stock, it usually happens that in the optimum solution “A” goes entirely to
Premium and “D” goes entirely to Regular. If “B” goes to Premium and “C”
goes to Regular, we can conclude that the cut point should be at $T_2$. If both
“B” and “C” go to Premium, the cut point should be at $T_3$ or at a higher
temperature; if they both go to Regular, it should be at $T_1$ or at a lower
temperature. There is nothing in the program that prevents “B” and “C” from splitting.
If “B” splits and “C” goes to Regular, the cut point should be between $T_1$ and
$T_2$. If “C” splits and “B” goes to Premium, the cut point should be between $T_2$
and $T_3$. These five situations are the normal ones encountered most of the time
because of the normal progression to higher sulfur content and lower octane as
the cuts get heavier. Occasionally, however, it may happen that both “B” and
“C” split in such a way that the fraction of “C” going into Premium is greater
than the corresponding fraction of “B” or that “B” goes entirely to Regular
and “C” goes entirely to Premium. These are situations which are not realizable
in practice because we have only one cut point in reality. To prevent such
situations from occurring, additional constraints are imposed on the system which
stipulate that the percentage of “B” going into Premium should be no less than
the corresponding percentage of “C”. These constraints will insure that the
optimal solution will be physically realizable without too much trouble.
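
The reading of the optimal solution described in this paragraph amounts to a small decision table. The sketch below encodes the five normal cases; the function name and the use of fractions between 0 and 1 are illustrative assumptions, and any other outcome is simply reported as falling outside the five cases.

```python
def interpret_cut_point(b_to_premium, c_to_premium, tol=1e-9):
    """Translate the fractions of fictitious cuts B and C sent to Premium
    into the conclusion about the re-run still cut point."""
    if not (0 <= b_to_premium <= 1 and 0 <= c_to_premium <= 1):
        raise ValueError("fractions must lie between 0 and 1")
    b_all, b_none = b_to_premium >= 1 - tol, b_to_premium <= tol
    c_all, c_none = c_to_premium >= 1 - tol, c_to_premium <= tol
    if b_all and c_none:
        return "at T2"
    if b_all and c_all:
        return "at T3 or higher"
    if b_none and c_none:
        return "at T1 or lower"
    if c_none:                      # B splits, C goes entirely to Regular
        return "between T1 and T2"
    if b_all:                       # C splits, B goes entirely to Premium
        return "between T2 and T3"
    return "not one of the five normal cases"


for b, c in [(1.0, 0.0), (1.0, 1.0), (0.0, 0.0), (0.4, 0.0), (1.0, 0.3), (0.2, 0.7)]:
    print(b, c, "->", interpret_cut_point(b, c))
```

The last pair, where more of “C” than of “B” goes to Premium, is the kind of outcome the additional constraints are meant to exclude.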


Fig. 13

6. Cost Coefficients

Before leaving the field of refining, let us consider the effect of cost coefficients 
on optimum gasoline blending. If the objective is an economic one, costs or values 
have to be determined for some of the stocks that are produced. This can be a 
complicated problem. In the case of the blending example discussed previously, 
the objective was to minimize lead costs minus credit for the excess stocks. This
meant that a value had to be determined for each stock which was not required 
to be used up. The situation was complicated further by the fact that some ex¬ 
cess stocks were earmarked for shipment to another refinery. This meant that 
their values had to be the values to that refinery which in turn depended on the 
local situation existing there during the time period of interest. These costs can 
be determined but they must be realistic for the solution to have meaning and a 
great deal of judgment and experience should go into their making. 

As an illustration of the effect of the cost coefficients on the optimal solution, 
consider a hypothetical blending problem where the volumes of Premium and 
Regular are allowed to vary but their ratio is fixed. The objective was to maxi¬ 
mize value of gasoline plus value of excess stocks minus TEL cost while meeting 
full quality requirements. Two cases were run which were identical in all respects 
except that in the second case the unit values of Premium and Regular were 
increased by a small amount. Stocks A through L were available for blending. The 
results are shown on Figures 14 and 15 where the composition, volume, and TEL 
content of the gasolines are compared for the two cases. As expected, the optimum 
gasoline production increased for case 2. Changes in composition also occurred. 
As far as Premium is concerned, it contains more B than before and contains C
which did not enter Premium for case 1. Regular loses its B content and part of C 
and absorbs F which was not utilized at all in case 1. The gross effect of the
change in the price structure is a shift of all of Regular’s B and part of its C to
Premium and extensive utilization of F in Regular. As the change in gasoline
value was not drastic, it can be seen that we are dealing here with a system
which is rather sensitive to the price structure.

Fig. 14

Fig. 15

7. Distribution to an Expanding Market 

Leaving the refinery with all its problems behind us we shall now turn to the 
area of marketing and distribution which has problems of its own. The classic 
example of a problem in this area is the transportation problem. A great deal of 
work has been done on this, particularly by oil companies. The first application of 
linear programming that we would like to discuss in this area is a type of
transportation problem which, however, has some complicating features. We are
indebted to Atlantic Refining Company for contributing this application.

Consider m refineries (i = 1, 2, ..., m) and n bulk terminals or distribution
centers (j = 1, 2, ..., n) as shown in Figure 16. At the present time, the refineries
are producing at levels $P_i$ and the demands at the bulk plants are $D_j$. We may
consider the sum of the $P_i$’s to be equal to the sum of the $D_j$’s so that all the
demands are met. Assume now that we find ourselves in an expanding market.
Projections are available for what the demand at the different bulk plants is
going to be, say, five years from now. Denote these projected demands by $D_j'$.
To try to meet the increased demands, we must expand refining capacity. Denote
the increased production by $P_i + e_i$ where $e_i$ is a variable denoting the amount of
expansion. We must also increase the capacity of our bulk plants. The expansion
of refining and bulk plant capacity costs money and an upper bound exists on
how much can be spent on over-all expansion. This upper bound is such that it is
impossible to meet the demand at all the distribution centers. The problem now is
to determine which refinery and bulk terminal to expand, and by how much, so
as to maximize the net return. The maximization of net return may not
necessarily be the best objective but we shall use it here for purposes of illustration.

Fig. 16 (refinery outputs $P_1 + e_1$, $P_2 + e_2$, ..., $P_m + e_m$)



This problem can be formulated as follows. The total production leaving the 
i’th refinery must be equal to the old production plus the expansion. Hence: 

$$\sum_{j=1}^{n} x_{ij} = P_i + e_i, \qquad i = 1, 2, \ldots, m \qquad (19)$$

where $x_{ij}$ is the amount shipped from i to j. The amount received at the j’th
bulk plant must be less than or can at most be equal to the projected demand in
that area. Hence:

$$\sum_{i=1}^{m} x_{ij} \le D_j', \qquad j = 1, 2, \ldots, n \qquad (20)$$

If $c_{iR}$ is the unit cost of expanding refinery capacity at i, then the total cost of
refinery expansion is $\sum_i c_{iR} e_i$. In considering the cost of bulk plant expansion,
we must take account of the fact that the shipments to some bulk plants may 
actually be reduced while others expand so as to be able to take full advantage of 
shifts in the market with the limited amount of expansion capital. The expansion 
of a bulk plant does require capital but a “contraction” does not because it 
simply means that shipments to the bulk plant are reduced. To handle this situa¬ 
tion, we add the relation 

$$\sum_{i=1}^{m} x_{ij} - D_j = s_j^{+} - s_j^{-}, \qquad j = 1, 2, \ldots, n \qquad (21)$$

to the system where $s_j^{+}$ and $s_j^{-}$ are non-negative variables. We also stipulate
that:

$$\sum_{i=1}^{m} c_{iR} e_i + \sum_{j=1}^{n} c_{jB} s_j^{+} \le M \qquad (22)$$

where $c_{jB}$ is the unit cost of bulk plant expansion and M is the maximum
expansion capital available. The term on the left of (21) is the difference between
the new shipments to j and the old shipments. If this difference is positive, then
j expands; if it is negative, then j “contracts”. It can be shown that either $s_j^{+}$
or $s_j^{-}$ but not both will be involved in the optimum basis. Hence, if j expands,
then $s_j^{+}$ will be in the basis and there will be an expansion cost in view of the
last constraint. If j “contracts”, then $s_j^{-}$ will be in the basis and there will be no
expansion cost.

Finally, the objective function is: 

$$\sum_{i=1}^{m} \sum_{j=1}^{n} c_{ij} x_{ij} - \sum_{i=1}^{m} c_{iR} e_i - \sum_{j=1}^{n} c_{jB} s_j^{+} = \max \qquad (23)$$

where $c_{ij}$ is the profit per barrel shipped from i to j.
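
To make the roles of the variables concrete, the sketch below evaluates a given shipment plan against relations (19) through (22) and computes the objective (23). It does not solve the linear program; the function name, the data, and the requirement that no refinery ships less than its present output are all assumptions made for illustration.

```python
def evaluate_plan(x, P, D_old, D_new, c, c_R, c_B, M):
    """Evaluate a shipment plan x[i][j] for the expansion model."""
    m, n = len(P), len(D_old)
    # (19): expansion at refinery i is what it ships beyond present output.
    e = [sum(x[i]) - P[i] for i in range(m)]
    received = [sum(x[i][j] for i in range(m)) for j in range(n)]
    # (21): split the change at each bulk plant into expansion and contraction.
    s_plus = [max(received[j] - D_old[j], 0) for j in range(n)]
    s_minus = [max(D_old[j] - received[j], 0) for j in range(n)]
    # Left side of (22): only expansion costs capital.
    capital = (sum(c_R[i] * e[i] for i in range(m))
               + sum(c_B[j] * s_plus[j] for j in range(n)))
    feasible = (all(v >= 0 for v in e)                               # assumed
                and all(received[j] <= D_new[j] for j in range(n))   # (20)
                and capital <= M)                                    # (22)
    profit = sum(c[i][j] * x[i][j] for i in range(m) for j in range(n))
    return {"e": e, "s_plus": s_plus, "s_minus": s_minus,
            "capital": capital, "feasible": feasible,
            "net_return": profit - capital}                          # (23)


# Hypothetical two-refinery, two-terminal case.
result = evaluate_plan(
    x=[[110, 10], [0, 90]],
    P=[100, 80], D_old=[90, 90], D_new=[120, 110],
    c=[[3, 2], [2, 3]], c_R=[1.0, 1.5], c_B=[0.5, 0.5], M=50,
)
print(result)
```

In this example both refineries and both terminals expand, the capital required is exactly 50, and the net return is 570; lowering M to 40 makes the same plan infeasible.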

This formulation is satisfactory as long as the new shipments to the “con¬ 
tracted” bulk terminals do not fall below a certain value (they may even go to 
zero). This is a situation which is analogous to the one encountered in discussing 
the model of a producing complex. The plot of profit at the j’th bulk plant as a 
function of shipments to the bulk plant is again a straight line displaced from the 
origin because of a fixed overhead. The actual profit function is again the straight 
line plus the origin. Thus, our objective function should really be: 

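$$\sum_{i=1}^{m} \sum_{j=1}^{n} c_{ij} x_{ij} - \sum_{i=1}^{m} c_{iR} e_i - \sum_{j=1}^{n} c_{jB} s_j^{+} - \sum_{j=1}^{n} k_j = \max \qquad (24)$$

where:

$$k_j = \begin{cases} 0 & \text{if } \sum_i x_{ij} = 0 \\ \text{const.} & \text{if } \sum_i x_{ij} > 0 \end{cases} \qquad (25)$$

but no general method exists for handling situations of this type. However, if it
turns out in the optimal solution that none of the bulk plant volumes contract by
substantial amounts, the solutions will be useful.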
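
The difficulty is that the profit at a bulk plant, as a function of shipments, is a straight line displaced by the fixed overhead together with the isolated point at the origin. A minimal sketch with hypothetical numbers:

```python
def bulk_plant_profit(shipments, unit_profit, fixed_overhead):
    """Profit at one bulk plant: zero if nothing is shipped, otherwise a
    straight line displaced downward by the fixed overhead."""
    if shipments < 0:
        raise ValueError("shipments cannot be negative")
    if shipments == 0:
        return 0.0
    return unit_profit * shipments - fixed_overhead


for barrels in (0, 10, 25, 100):
    print(barrels, bulk_plant_profit(barrels, unit_profit=2.0, fixed_overhead=50.0))
```

The jump between zero shipments and very small shipments is what a linear objective cannot represent, which is why the linear formulation is trustworthy only when no terminal contracts substantially.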

8. Service Station Deliveries—Long Range 

Having considered the link of refinery to bulk terminal, let us now consider
the last link in the chain: the flow of products from bulk terminal to service
stations. Consider the situation shown in Figure 17. We are given the location of
service stations and the roads connecting them. The small circles are the service
stations, while the large circle denotes the bulk plant which supplies them by
truck. Each service station, k, requires a delivery of $D_k$ gallons of gasoline (for
simplicity, let us assume only one grade of gasoline). Different truck types,
denoted by the index s, are available for making the deliveries. The trucks differ
in regard to carrying capacity and operating characteristics. We have a number
of trucks of each type available for the operation. The problem is to devise a
delivery schedule such that the transportation cost is minimized.

We are actually dealing here with two different types of problems depending 
on whether we look at this operation from the long-range or the short-range point 
of view. Let us consider the long-range point of view at first. 

Assume that we look at this operation over an extended period of time so that
the $D_k$ represent total demands at the service stations during the period under
consideration. Assume, furthermore, that the ratios of the $D_k$’s to the gallon
capacity of each of the trucks are sufficiently large so that many deliveries have to
be made to each station during that period in order to meet the demand. Under
these conditions, the problem becomes a transportation type problem with
trans-shipment of goods. The trans-shipment feature comes about through the fact
that if a truck leaves the bulk plant and makes deliveries to, say, service stations
1, 2, and 3 in that order, then the gasoline destined for station 2 is trans-shipped
via station 1 and the gasoline for station 3 is trans-shipped via stations 1 and 2.

Fig. 17

Some work has been done on the trans-shipment problem (Manne, 1954;
Kalaba and Juncosa, 1956; Dwyer and Galler, 1956; Orden, 1956) in connection
with aircraft scheduling and communication networks. Our problem here is
slightly different but the general approach is the same. The key to the
mathematical formulation lies in the use of triple indices. Adopt the convention that
the first index refers to the point of departure, the second index to the
intermediate destination and the third index to the ultimate destination. If $y_{ijk}$
denotes the number of gallons shipped from i to j destined for k, then we can
write:

$$\sum_i y_{ijk} = \sum_u y_{juk}, \qquad \text{all } j, k \text{ but } j \ne k \qquad (26)$$

$$\sum_i y_{ikk} = D_k, \qquad \text{all } k \qquad (27)$$

$$\sum_j \sum_k y_{0jk} = \sum_k D_k \qquad (28)$$

The left side of (26) is the sum of what arrives at j from all points but destined 
for k while the right side is the sum of what leaves j for all points destined for k. 
These two must be equal because we do not wish to accumulate anything at j 
destined for k. Equation (27) states that the sum of what arrives at k from all 
points and is destined for k must be equal to the demand at k. Equation (28) 
states that the sum of what leaves the bulk plant (indicated by the index zero) 
must be equal to the total demand. These three conditions insure that we deliver 
the proper number of gallons where they are required and that they all originate 
at the bulk plant. 
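
The triple-index bookkeeping can be checked on the route mentioned earlier, in which one truck visits stations 1, 2, and 3 in that order. The sketch below builds the flows for such a loop and verifies the three conditions; the function names and demand figures are hypothetical, and condition (26) is applied at service stations only, not at the bulk plant (index 0).

```python
from collections import defaultdict


def flows_from_route(route, demand):
    """Build y[(i, j, k)] for one delivery loop 0 -> route[0] -> route[1] ...

    Gasoline for station k rides along every leg from the bulk plant up to k,
    so it is trans-shipped via the earlier stations on the route.
    """
    y = defaultdict(float)
    stops = [0] + list(route)
    for pos, k in enumerate(route, start=1):
        for i, j in zip(stops[:pos], stops[1:pos + 1]):
            y[(i, j, k)] += demand[k]
    return y


def check_flows(y, demand):
    """Return True if the flows satisfy conditions (26), (27), and (28)."""
    stations = set(demand)
    for k in stations:
        for j in stations - {k}:
            arriving = sum(v for (i, jj, kk), v in y.items() if jj == j and kk == k)
            leaving = sum(v for (jj, u, kk), v in y.items() if jj == j and kk == k)
            if abs(arriving - leaving) > 1e-9:          # (26)
                return False
        delivered = sum(v for (i, jj, kk), v in y.items() if jj == k and kk == k)
        if abs(delivered - demand[k]) > 1e-9:           # (27)
            return False
    shipped = sum(v for (i, j, k), v in y.items() if i == 0)
    return abs(shipped - sum(demand.values())) < 1e-9   # (28)


demand = {1: 500.0, 2: 300.0, 3: 200.0}
y = flows_from_route([1, 2, 3], demand)
print(dict(y))
print(check_flows(y, demand))
```

The leg from the bulk plant to station 1 carries all 1000 gallons, of which 300 are destined for station 2 and 200 for station 3.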

If $x_{ijs}$ is the number of truck runs per period from i to j in the s-type truck
(the index s denotes the type of truck and not ultimate destination), then we
must also specify that:

$$\sum_i x_{ijs} = \sum_u x_{jus}, \qquad \text{all } j, s \qquad (29)$$

which means that the number of s-type trucks which arrive at j is equal to the
number of s-type trucks that leave j.

To insure that we have enough carrying capacity available for each i-j route,
we must stipulate that:

$$\sum_k y_{ijk} \le \sum_s q_s x_{ijs}, \qquad \text{all } i, j \qquad (30)$$

where $q_s$ is the carrying capacity of the s-type truck. The left side represents
the actual number of gallons of gasoline that are hauled from i to j, while the
right side represents the maximum number that can be hauled. The slack in this
relation is indicative of the fact that the trucks may have to run partially full
or empty some of the time.

If we denote the time required by the s-type truck to go from i to j by $h_{ijs}$
and let $h_s$ be the time that an s-type truck can be used per period, then we have
the additional constraint:

$$\sum_i \sum_j h_{ijs} x_{ijs} \le h_s, \qquad \text{all } s \qquad (31)$$

which insures that the trucks are not run longer than possible during the time
period under consideration.

The objective is: 


$$\sum_i \sum_j \sum_s c_{ijs} x_{ijs} = \min \qquad (32)$$

where $c_{ijs}$ is the cost per trip of operating an s-type truck over the link i-j.
Having determined the x’s and y’s, a schedule can then be constructed from them.
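
The truck-side conditions can be checked in the same spirit. The sketch below tests a proposed set of truck runs against the truck balance (29), the carrying capacity (30), and the time limit (31), and then evaluates the cost (32). All names and numbers are hypothetical, with a single truck type "A" running one loop four times per period.

```python
def check_truck_runs(x, gallons, capacity, trip_hours, hours_available, trip_cost):
    """x[(i, j, s)]: runs per period from i to j by truck type s.
    gallons[(i, j)]: gallons hauled over link i-j (the sum over k of y_ijk).
    Returns (feasible, total_cost)."""
    nodes = {i for (i, j, s) in x} | {j for (i, j, s) in x}
    types = {s for (i, j, s) in x}
    for s in types:
        for j in nodes:
            arrive = sum(v for (i, jj, ss), v in x.items() if jj == j and ss == s)
            leave = sum(v for (jj, u, ss), v in x.items() if jj == j and ss == s)
            if arrive != leave:                              # (29)
                return False, None
        hours = sum(trip_hours[key] * v for key, v in x.items() if key[2] == s)
        if hours > hours_available[s]:                       # (31)
            return False, None
    for (i, j), hauled in gallons.items():
        room = sum(capacity[s] * v for (ii, jj, s), v in x.items() if (ii, jj) == (i, j))
        if hauled > room:                                    # (30)
            return False, None
    cost = sum(trip_cost[key] * v for key, v in x.items())   # (32)
    return True, cost


x = {(0, 1, "A"): 4, (1, 2, "A"): 4, (2, 3, "A"): 4, (3, 0, "A"): 4}
gallons = {(0, 1): 4000, (1, 2): 2000, (2, 3): 800}
trip_hours = {key: 0.5 for key in x}
trip_cost = {key: 3.0 for key in x}
print(check_truck_runs(x, gallons, {"A": 1000}, trip_hours, {"A": 40}, trip_cost))
print(check_truck_runs(x, gallons, {"A": 900}, trip_hours, {"A": 40}, trip_cost))
```

With 1000-gallon trucks the schedule is feasible at a cost of 48; with 900-gallon trucks the first link lacks capacity. The slack on the later links reflects trucks running partially full, and the return leg to the bulk plant is run empty.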

For any actual problem, the number of constraints represented by equations
(26) to (32) is rather frightening and can be beyond the capacity of even the
largest computers if the standard Simplex procedure is employed. Fortunately,
the matrix involved here exhibits a great deal of structure. Efforts are under way
in a number of places to exploit this structure (as the structure of the ordinary
transportation problem was exploited) so as to reduce the computational labor.

The problem considered here represents a simplified situation but the type of 
analysis employed is representative of what is done for more sophisticated 
models. In any actual problem, the x’s are limited to integral values so that the 
non-integral optimal solution must be adjusted to integral x’s. If, as we assumed 
in the beginning, many trips are necessary to meet the demand, this imperfection 
may not be too serious. Unfortunately, no general methods exist at present for 
handling linear programming problems in which some or all of the variables are 
constrained to be integers. 

9. Service Station Deliveries—Short Range

The problem just discussed permits us to look at the over-all situation on a 
long-range basis. From a short-range point of view, however, the problem is 
somewhat different and much more complicated. It becomes similar in type to 
the so-called “clover-leaf problem” or “farmer’s daughter problem” which in
turn is related to the classic problem of the traveling salesman. In the traveling 
salesman problem, we have a number of towns which the salesman desires to 
visit in a sequence such that the total distance traveled is a minimum. In the 
farmer’s daughter problem, we have the same situation plus the additional con¬ 
straint that the salesman wishes to return to, let us say, town “A” before a cer¬ 
tain maximum time has elapsed. In the traveling salesman problem, the solution 
consists of a single loop while in the farmer’s daughter problem, the solution 
consists of a number of loops, each originating and terminating at town “A”. 
The farmer’s daughter, of course, is at “A”. 

Returning now to our delivery problem and examining it from the short-range
point of view, it turns out that it is similar to the farmer’s daughter problem
except for some additional complications of its own. On a daily basis, the
dispatcher at the bulk terminal has a list of service stations to which deliveries of
certain amounts must be made today because the service stations are on the
verge of running out of gasoline. As before, he has trucks of different types at
his disposal and his problem now is to devise routes for the trucks so that the
deliveries are made at minimum transportation cost. These routes, of course,
originate and terminate at the bulk plant. Thus, the bulk plant is equivalent to
the farmer’s daughter but instead of having only one boy friend she has as many
as we have different types of trucks on the road. One of the important
differences between the long-range formulation of this problem and the formulation
on a daily basis is that in the former the individual trucks lose their identity
except for the type to which they belong, while in the latter, each truck must
be considered as an entity.

The Operations Research Group at Atlantic became interested in devising 
means for handling this problem on a daily basis. With the assistance of George 
Dantzig, a method was devised that is not guaranteed to lead to the optimum 
solution but will usually yield a solution rather close to it. 

10. Conclusion 

We have attempted in this paper to discuss some oil industry problems and to
indicate how linear programming was or can be used to solve them. There can be
no doubt that linear programming has made a place for itself in the oil industry,
particularly in the manufacturing phase. It is beginning to be appreciated by
management as an important help in making complicated decisions. It must be
realized, however, that not everything in this world is linear and that
occasionally we come across constraints which are mathematically pathological
types. This is good in a way because if ever a method is devised that solves all
problems, life would become rather dull. Much still remains to be done. We need
a great deal more basic research on optimization methods in the universities and
industrial research laboratories.

It should be pointed out that the successful application of linear programming
to practical problems was made possible by the advent of large, high-speed
computers and by the existence of an efficient linear programming code. If digital
computers were nonexistent, the answers would be many years too late. We would
like to express our thanks to William Orchard-Hays and Leola Cutler of the RAND
Corporation and to Harold Judd of IBM for the excellent code which they
developed for the IBM 704, and made available to industry.

We would like to thank the management and personnel of Magnolia Petroleum,
Esso Research and Engineering, Atlantic Refining, Arabian American Oil,
Richfield Oil and Shell Development for their assistance and cooperation in the
preparation of this paper.
QUADRATIC PROGRAMMING AS AN EXTENSION
OF CLASSICAL QUADRATIC MAXIMIZATION* 

H. THEIL and C. VAN DE PANNE 1 

Netherlands School of Economics, Econometric Institute

The article describes a procedure to maximize a strictly concave quadratic 
function subject to linear constraints in the form of inequalities. First the 
unconstrained maximum is considered; when certain constraints are violated, 
maximization takes place subject to each of these in equational (rather than 
inequality) form. The constraints which are then violated are added in a similar 
way to the constraints already imposed. It is shown that under certain general 
conditions this procedure leads to the required optimum in a finite number of 
steps. The procedure is illustrated by an example while also a directory of 
computations is given. 


1. Introduction 

The problem of quadratic programming consists of maximizing a quadratic
function of a vector $x$,

(1.1) $Q(x) = a'x - \tfrac{1}{2} x'Bx,$

subject to the condition that none of the components of $x$ be negative:

(1.2) $x \ge 0,$

and possibly also subject to certain additional linear constraints:

(1.3) $C^{*\prime} x \le d^*.$

It is assumed that the matrices $a$, $B$, $C^*$ and $d^*$ are given and that $B$ is a
positive-definite $n \times n$ matrix. Further, it will prove convenient to combine
the constraints (1.2) and (1.3) such that they are written as

(1.4) $C'x \le d,$

which implies $C = [-I \;\; C^*]$, $d' = [0 \;\; d^{*\prime}]$. Here $C'$ is an $N \times n$ matrix with
$N \ge n$. 2
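The combination of (1.2) and (1.3) into (1.4) can be sketched as follows. The function names and the single extra constraint $x_1 + x_2 \le 4$ are illustrative assumptions; matrices are stored as plain lists of rows, with one column of $C$ per constraint.

```python
def combine_constraints(C_star, d_star):
    """Build C = [-I  C*] and d' = [0  d*'] of (1.4).

    C_star has n rows and one column per additional constraint;
    d_star holds the corresponding right-hand sides.
    """
    n = len(C_star)
    neg_identity = [[-1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    C = [neg_identity[i] + list(C_star[i]) for i in range(n)]
    d = [0.0] * n + list(d_star)
    return C, d


def lhs(C, x):
    """Return the vector C'x, one entry per constraint."""
    n, N = len(C), len(C[0])
    return [sum(C[i][h] * x[i] for i in range(n)) for h in range(N)]


# Hypothetical case: n = 2 and one extra constraint x1 + x2 <= 4.
C_star = [[1.0], [1.0]]
d_star = [4.0]
C, d = combine_constraints(C_star, d_star)
values = lhs(C, [3.0, 2.0])  # compare entrywise with d to test C'x <= d
```

Here the point (3, 2) satisfies both nonnegativity constraints but exceeds the third right-hand side, so it is not admissible.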

Usually, a quadratic programming problem is solved by starting with some $x$
which satisfies the constraints but which does not necessarily maximize $Q$,
after which this $x$ is replaced by a series of other $x$'s, all of which satisfy the
constraints, until the required solution is found. 3 Quite a different procedure is
that of maximizing $Q$ without taking account of the constraints, to see whether
the resulting solution does or does not satisfy the constraints, and to use this
information as a basis for further computations. This approach will be followed
here; it was inspired by the work done by one of the authors [9] on quadratic
criterion functions in economics, and it is based on the consideration that
maximizing a quadratic function subject to linear equations is so much simpler than
maximizing subject to linear inequalities. An outline of the procedure is given
in Section 2, a more rigorous analysis in Section 3. Section 4 contains an
example, Section 5 some concluding remarks.

* Received March 1960.

1 The authors are indebted to Mr. P. J. M. van den Bogaard of the Econometric Institute
for his detailed comments on an earlier version of this paper.

2 The procedure to be proposed is also applicable when none or only some of the
nonnegativity constraints (1.2) are imposed. In that case we may have $N < n$.

2. Outline of the Procedure 
2.1. Information Supplied by the Unconstrained Maximum 
The vector which maximizes $Q$ without regarding the constraints is

(2.1) $x^0 = B^{-1}a,$

as is easily verified by straightforward differentiation. Clearly, if $x^0$ satisfies the
constraints (i.e., if $C'x^0 \le d$), then $x^0$ is the required solution because a
constrained maximum can never exceed the unconstrained maximum. The
interesting possibility is therefore the one in which $x^0$ violates one or more
constraints. Consider then Fig. 1 which deals with the simple case $n = 2$ in which
the only constraints are nonnegativity constraints. The condition $x_2 \ge 0$ is
violated by $x^0$; so we have to be satisfied with a lower $Q$-value than $Q(x^0)$. Now
given the fact that $B$ is positive-definite, the locus of constant $Q$-values is a
family of concentric ellipses (ellipsoids if $n \ge 3$) around $x^0$. The optimal solution
will then be such that it lies in the admissible region (the positive quadrant)
and on the ellipse which is nearest to $x^0$. Clearly, this is the point where an ellipse
touches the horizontal axis; in algebraic terms, it is the vector which is obtained
by maximizing $Q$ subject to $x_2 = 0$, i.e., subject to the second nonnegativity
constraint written in the form of an equation instead of an inequality. We shall
say in such a case that this constraint is satisfied exactly by the vector
considered, or that it is satisfied in equational form. Since the vector considered is
the optimal vector (to be written as $\hat{x}$ from now on), the result of the example
of Fig. 1 can simply be described as follows: The vector of the unconstrained
maximum violates one of the constraints, and the constrained maximum (the
optimal vector) satisfies the same constraint in equational form.
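A small numerical sketch of the situation described for Fig. 1 may help. The values of $a$ and $B$ below are hypothetical (the figure's own data are not given in the text); they are chosen so that the unconstrained maximum violates only the second nonnegativity constraint.

```python
def solve_2x2(B, a):
    """Solve B x = a for a 2 x 2 matrix B by Cramer's rule."""
    det = B[0][0] * B[1][1] - B[0][1] * B[1][0]
    return [(a[0] * B[1][1] - B[0][1] * a[1]) / det,
            (B[0][0] * a[1] - a[0] * B[1][0]) / det]


def Q(x, a, B):
    """Quadratic criterion (1.1): a'x - (1/2) x'Bx."""
    linear = sum(ai * xi for ai, xi in zip(a, x))
    quadratic = sum(x[i] * B[i][j] * x[j] for i in range(2) for j in range(2))
    return linear - 0.5 * quadratic


# Hypothetical data with B positive-definite.
a = [4.0, -1.0]
B = [[2.0, 1.0], [1.0, 2.0]]

x0 = solve_2x2(B, a)  # (2.1): unconstrained maximum, here with x2 < 0

# Impose x2 = 0 as an equation: Q reduces to a1*x1 - (1/2)*B11*x1**2,
# which is maximized at x1 = a1 / B11.
x_hat = [a[0] / B[0][0], 0.0]
```

With these numbers the unconstrained maximum is (3, -2) and the constrained one is (2, 0), which lies on the horizontal axis and has a lower $Q$-value, as the text describes.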

Next, consider Fig. 2 in which $x^0$ violates both nonnegativity constraints.
Given our experience with the case of Fig. 1, the obvious approach seems to be
to impose the two constraints in equational form one after the other. Thus,
when imposing $x_2 = 0$ we obtain a vector $x^{(2)}$ which violates the constraint
$x_1 \ge 0$; 4 and if we impose $x_1 = 0$ we obtain $x^{(1)}$ which violates no constraint and
which is obviously, as graphical inspection shows, the optimal vector $\hat{x}$. This result
suggests that the rule mentioned at the end of the preceding paragraph should
be extended as follows: If the vector of the unconstrained maximum violates
one or more constraints, one of these is satisfied exactly by the optimal vector.

3 For other contributions to the problem of quadratic programming, see the list of
references at the end of this paper.

Fig. 2

Further investigations show that the expression “one of these” is to be
replaced by “at least one of these.” For example, if we consider Fig. 3 and note
that $x^0$ violates the constraint $x_2 \ge 0$, then the obvious approach is to impose
$x_2 = 0$. But the resulting vector $x^{(2)}$ violates $x_1 \ge 0$, and it is easily seen that
the optimal vector is the origin, $x^{(1,2)}$, which satisfies both constraints exactly.
The situation is precisely the same when we have additional constraints besides
the nonnegativity constraints. This case is illustrated in Fig. 4 for two
nonnegativity constraints (indicated by 1 and 2) and two additional constraints
(3 and 4). The shaded area is the admissible region. It is seen that $x^0$ violates
constraints 3 and 4, that maximizing subject to 3 or to 4 in equational form does
not lead to $\hat{x}$ [because both $x^{(3)}$ and $x^{(4)}$ violate one constraint, viz., 4 and 3
respectively], and that $\hat{x}$ satisfies both 3 and 4 exactly. Here, therefore, all
constraints violated by $x^0$ are satisfied exactly by $\hat{x}$. All these cases discussed so far
are covered by the following Rule, the proof of which will be given in Section 3:

Rule 1. If $x^0$ (the vector of the unconstrained maximum) violates certain
constraints, then $\hat{x}$ (the optimal vector) satisfies at least one of these exactly.

4 We shall introduce the notation $x^{(2)}$ for the vector $x$ which is obtained by maximizing
$Q$ subject to constraint 2 in equational form, $x^{(2,4)}$ for the vector obtained by maximization
subject to constraints 2 and 4 in equational form, etc.

Fig. 3

Fig. 4


2.2. Maximization Subject to Subsets of Constraints in Equational Form 

The preceding discussion shows that $\hat{x}$ is found by maximizing $Q$ subject to a
certain subset $S$ of the $N$ constraints $C'x \le d$ written in equational form
($C'x = d$). 5 Of course, in general we do not know this particular $S$ and, in fact, our
main task will be to find it. Even so, it is important to observe at this stage that we
can easily derive for any $S$ the vector $x^S$ which maximizes $Q$ subject to $S$ in
equational form. Let us therefore arrange the $N$ constraints such that those of
$S$ are the first ones, and let us denote by $T$ the set of constraints not in $S$. Then
the coefficient matrices of (1.4) can be partitioned according to

(2.2) $C = [C_S \;\; C_T]; \quad d = \begin{bmatrix} d_S \\ d_T \end{bmatrix}.$

5 This subset may be the empty set, viz., when the vector of the unconstrained maximum
does not violate any of the constraints ($x^0 = \hat{x}$). Note also that in some cases the
problem of maximizing $Q$ subject to $S$ in equational form is trivial, viz., when $S$ is such that
only one $x$ satisfies $S$ in equational form. This is the case in Figs. 3 and 4.

Consider then the well-known Lagrangean expression

(2.3) $a'x - \tfrac{1}{2} x'Bx - \lambda_S'(C_S'x - d_S),$

where $\lambda_S$ is a vector of Lagrange multipliers and $C_S'x - d_S = 0$ are the
constraints in $S$ written in equational form. Differentiating (2.3) with respect to $x$,
we obtain

(2.4) $x^S = x^0 - B^{-1}C_S\lambda_S.$

Premultiplying (2.4) by $C_S'$ gives

(2.5) $C_S'B^{-1}C_S\lambda_S = C_S'x^0 - d_S,$

because $C_S'x^S = d_S$. If we now define

(2.6) $E = C'B^{-1}C = \begin{bmatrix} C_S'B^{-1}C_S & C_S'B^{-1}C_T \\ C_T'B^{-1}C_S & C_T'B^{-1}C_T \end{bmatrix} = \begin{bmatrix} E_S & F' \\ F & E_T \end{bmatrix},$ say;

(2.7) $e = C'x^0 - d = \begin{bmatrix} C_S'x^0 - d_S \\ C_T'x^0 - d_T \end{bmatrix} = \begin{bmatrix} e_S \\ e_T \end{bmatrix},$

it is easily seen that (2.5) implies

(2.8) $\lambda_S = E_S^{-1}e_S,$

which expresses $\lambda_S$ in known quantities; the existence of $E_S^{-1}$ is ensured if the
rank of $C_S$ equals the number of constraints in $S$. Furthermore, by
premultiplying (2.4) by $C_T'$ we find

(2.9) $C_T'x^S = C_T'x^0 - C_T'B^{-1}C_S\lambda_S,$

and the right-hand side should be $\le d_T$ in order that $x^S$ satisfies the constraints
in $T$. Applying (2.6), (2.7) and (2.8), we find that this condition can be written
in the simple form

(2.10) $FE_S^{-1}e_S - e_T \ge 0.$

As will appear below, the left-hand side of (2.10) is the only thing that needs to
be computed for the relevant subsets $S$ of the constraints. It is also easily verified
that the corresponding $Q$-value is

(2.11) $Q(x^S) = \tfrac{1}{2} x^{0\prime}Bx^0 - \tfrac{1}{2} e_S'E_S^{-1}e_S = Q(x^0) - \tfrac{1}{2} e_S'E_S^{-1}e_S,$

because $Q(x^S) = \tfrac{1}{2} x^{0\prime}Bx^0 - \tfrac{1}{2}(x^S - x^0)'B(x^S - x^0)$ follows from (1.1) and
(2.1), and $(x^S - x^0)'B(x^S - x^0) = e_S'E_S^{-1}e_S$ from (2.4).
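Formulas (2.1), (2.4), (2.8) and (2.11) translate directly into a short routine. The sketch below is an illustration under stated assumptions: the function names, the small Gauss-Jordan solver, and the numerical data are hypothetical, and $C$ is stored with one column per constraint as in (1.4).

```python
def solve(A, b):
    """Solve A z = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [list(map(float, A[i])) + [float(b[i])] for i in range(n)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [u - f * v for u, v in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def x_S(a, B, C, d, S):
    """Maximize Q subject to the constraints in S (1-based) as equations.

    Returns x^S, the multipliers lambda_S, and the loss (1/2) e_S' E_S^{-1} e_S
    by which Q(x^S) falls short of Q(x^0), see (2.11).
    """
    n = len(a)
    x0 = solve(B, a)                                       # (2.1)
    cols = [[C[i][h - 1] for i in range(n)] for h in S]    # columns of C_S
    binv_cols = [solve(B, c) for c in cols]                # B^{-1} C_S
    E = [[sum(ci * bj for ci, bj in zip(c, bc)) for bc in binv_cols]
         for c in cols]                                    # E_S of (2.6)
    e = [sum(ci * xi for ci, xi in zip(c, x0)) - d[h - 1]
         for c, h in zip(cols, S)]                         # e_S of (2.7)
    lam = solve(E, e) if S else []                         # (2.8)
    x = [x0[i] - sum(l * bc[i] for l, bc in zip(lam, binv_cols))
         for i in range(n)]                                # (2.4)
    loss = 0.5 * sum(l * ei for l, ei in zip(lam, e))
    return x, lam, loss


def violated(C, d, x, tol=1e-9):
    """1-based indices h with c_h'x > d_h, i.e. where (2.10) fails."""
    n = len(x)
    return [h + 1 for h in range(len(d))
            if sum(C[i][h] * x[i] for i in range(n)) > d[h] + tol]


# Hypothetical data: n = 2, nonnegativity constraints only (C = -I, d = 0).
a = [4.0, -1.0]
B = [[2.0, 1.0], [1.0, 2.0]]
C = [[-1.0, 0.0], [0.0, -1.0]]
d = [0.0, 0.0]

x_unc, _, _ = x_S(a, B, C, d, [])      # S empty gives x^0
x_2, lam_2, loss_2 = x_S(a, B, C, d, [2])
```

For these numbers $x^0 = (3, -2)$ violates constraint 2, and imposing that constraint as an equation gives $x^{(2)} = (2, 0)$ with multiplier 3 and a loss of 3 in the $Q$-value; `violated` then returns an empty list for $x^{(2)}$.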
2.3. Further Steps of the Computation

The above shows that we can in principle derive $\hat{x}$ by considering $x^S$ for all
subsets of the $N$ constraints (1.4), viz., by verifying whether they satisfy the
constraints and by computing their $Q$-values, see (2.10) and (2.11). However,
this is far from efficient for several reasons. Firstly, it follows from Rule 1 that
we can confine our attention to those sets $S$ that contain at least one constraint
violated by $x^0$. Secondly, it obviously makes no sense to consider a set $S$ of
which the constraints are contradictory when written in equational form. For
example, if two of the $N$ constraints are

$x_1 + x_2 \ge 1$ and $x_1 + x_2 \le 5,$

then no $x^S$ exists for any $S$ containing these two constraints. Thirdly, we can
extend Rule 1 in a manner which may be described as follows. Suppose $x^0$
violates constraints 1, 2, and 3. It seems obvious to compute in the next round $x^S$
for those sets $S$ which consist of one constraint. We can then confine ourselves
to $x^{(1)}$, $x^{(2)}$, $x^{(3)}$ [corresponding to $S$ = (1), (2), (3), respectively], because
$x^{(i)} = \hat{x}$ for $i \ge 4$ is impossible in view of Rule 1. Suppose then that none of
the three vectors $x^{(1)}$, $x^{(2)}$, $x^{(3)}$ satisfies all constraints; more precisely that $x^{(1)}$
violates constraints 3 and 5, that $x^{(2)}$ violates 3, and that $x^{(3)}$ violates 1 and 2.
The question arises what to do next. The answer is supplied by

Rule 2. Suppose that two or more constraints are satisfied exactly by $\hat{x}$ and
partition the set of these constraints into two subsets, $S$ and $S'$, containing at least one
constraint each. Then $x^S$ (the vector which maximizes $Q$ subject to $S$ in equational
form) violates at least one constraint which is an element of $S'$. 6

6 For an exception to this Rule, see the last paragraph of Section 2.3.

Applying this Rule to our example, we observe first that it is indeed
applicable because $\hat{x}$ has to satisfy at least two constraints exactly: both $x^0$ and $x^{(1)}$,
$x^{(2)}$, $x^{(3)}$ violate at least some constraint, so Rule 1 does not admit the
possibility of an $\hat{x}$ which satisfies less than two constraints in equational form.
Suppose now for a moment that constraint 1 is one of the constraints which is
satisfied exactly by $\hat{x}$. Then Rule 2 states that $x^{(1)}$ violates some constraint which is
satisfied exactly by $\hat{x}$, which means that $\hat{x}$ must satisfy exactly, not only
constraint 1, but also 3 or 5; and so in the next round, when considering all relevant
$x^S$ for two-element constraint sets $S$, we should take $S$ = (1, 3) and (1, 5).
However, this argument is based on the assumption that constraint 1 is one of
the constraints satisfied exactly by $\hat{x}$; and we cannot be sure that this is true.
The only thing we can be sure about is that $\hat{x}$ satisfies either 1 or 2 or 3 in
equational form, because these are the constraints violated by $x^0$. Hence we must
repeat the same argument under the alternative assumptions that $\hat{x}$ satisfies
2 or 3 exactly. Assuming then that constraint 2 is one of the constraints satisfied
by $\hat{x}$, we find that $S$ = (2, 3) is the two-element constraint set to be considered
in the next round, since $x^{(2)}$ violates constraint 3. Similarly, assuming that 3 is
one of the constraints satisfied by $\hat{x}$ in equational form, we find that $S$ = (3, 1)
and (3, 2) are to be considered. As a whole, therefore, five two-element
constraint sets appear: (1, 3), (1, 5), (2, 3), (3, 1), (3, 2), from which however
the last two can be deleted since they are identical with the first and the third
respectively. For each of the remaining three we have to verify whether their
$x^S$ does or does not violate certain constraints. If none of these $x^S$'s satisfies all
constraints, we have to proceed to three-element constraint sets. This does not
lead to any novel features, as will be seen in Section 4; Rule 2 is then applicable
just as it was here, its vector $x^S$ being then interpreted as corresponding to a
two-element rather than a one-element set $S$.
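The bookkeeping of this example can be sketched in a few lines. The function name and data layout are illustrative assumptions: each constraint set of the current round is mapped to the constraints its vector violates, and every violated constraint is added in turn to form the sets of the next round.

```python
def next_round(violations):
    """Given {S: constraints violated by x^S} for the current round,
    return the constraint sets to be examined in the next round."""
    candidates = set()
    for S, violated in violations.items():
        for h in violated:
            # Sets are unordered, so (3, 1) and (1, 3) coincide automatically.
            candidates.add(S | {h})
    return candidates


# The example of the text: x^(1) violates 3 and 5, x^(2) violates 3,
# and x^(3) violates 1 and 2.
first_round = {
    frozenset({1}): {3, 5},
    frozenset({2}): {3},
    frozenset({3}): {1, 2},
}
second_round = next_round(first_round)
```

The five candidate sets of the text collapse to the three distinct sets (1, 3), (1, 5) and (2, 3) because duplicates are dropped by the set type.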

It is to be noted that there is one exception to Rule 2. It may happen that $x^S$
coincides with $\hat{x}$ even though $x^S$ is imposed to satisfy fewer constraints than
$\hat{x}$ does. This is a degenerate case in which $x^S$ happens to satisfy in equational
form one or more constraints which do not fall under $S$. An example is given in
Fig. 5: maximizing $Q$ subject to constraint 1 in equational form leads to a vector
$x^{(1)}$ which happens to satisfy constraint 3 exactly. Hence $x^{(1)} = x^{(1,3)}$ ($= \hat{x}$)
in this case. In this article it will be assumed that there are no such problems of
degeneracy. 7

7 Some partial results on the problem of degeneracy have been obtained, but they are
not reported here. We expect to come back to it in a later publication.

2.4. Completing the Computation: Verification of the Solution

The procedure described above amounts to considering first the vector of the
unconstrained maximum, then the vectors which maximize $Q$ subject to certain
one-element subsets of constraints in equational form, then vectors corresponding
to two-element subsets, and so on. At a certain point we shall arrive at a vector
$x^S$ which violates no constraint, and the question then arises whether this is the
vector $\hat{x}$ which we look for. It would be convenient if this would always be true,
but unfortunately this is not the case as can be shown by means of the example
illustrated in Fig. 6. There $x^0$ violates 3 and 4, so that in the next round we
consider $x^{(3)}$ and $x^{(4)}$. Doing so, we find that $x^{(3)}$ violates 4 [implying that $x^{(3,4)}$
is to be considered next] while $x^{(4)}$ violates 2 [so that $x^{(2,4)}$ is to be considered].
Now $x^{(3,4)}$ violates no constraints, so that it might be the optimal vector; but
$x^{(2,4)}$ does not violate any constraint either and hence it is clear that a special
rule is necessary in order to find out whether or not such an $x^S$ equals $\hat{x}$. This is
provided by

Rule 3. Suppose that for some subset $S$ of the constraints, $x^S$ exists and violates
none of the constraints; then $x^S = \hat{x}$ if and only if every $x^{S_h}$ violates the $h$-th
constraint where $S_h$ is the set of all constraints satisfied exactly by $x^S$ excluding the $h$-th.

The application of this Rule to our example runs as follows. Considering $x^{(3,4)}$
we observe that (3, 4) is the set of constraints that are satisfied in equational
form, so the sets $S_h$ to be analyzed are the one-element set (3), obtained by
excluding constraint $h = 4$, and the one-element set (4), obtained by excluding
$h = 3$. Hence we have to verify whether it is true that $x^{(3)}$ violates constraint 4
and that $x^{(4)}$ violates constraint 3. An inspection of Fig. 6 shows that this is the
case as far as $x^{(3)}$ is concerned, but not for $x^{(4)}$: this vector violates constraint 2,
not 3. Next, consider $x^{(2,4)}$; it satisfies constraints 2 and 4 exactly, so we have to
consider $x^{(4)}$ and $x^{(2)}$ and to verify whether $x^{(4)}$ violates constraint 2 and $x^{(2)}$
violates constraint 4. The answer is positive as Fig. 6 shows and we can therefore
conclude that $x^{(2,4)} = \hat{x}$. Of course, this result is immediately obvious by
graphical inspection, but a graphical device works only when the number of variables
is very small. When we deal with more than a few variables, an algebraic device
like that of Rule 3 cannot be avoided.
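The test of Rule 3 on the Fig. 6 example can be written as a short check. The function name and the dictionary layout are illustrative assumptions; the violation sets are those stated in the text.

```python
def passes_rule_3(S, violations):
    """Rule 3 for a vector x^S that violates no constraint.

    S is the set of constraints satisfied exactly by x^S, and
    violations[frozenset(T)] is the set of constraints violated by x^T.
    The rule holds when, for every h in S, the vector obtained after
    dropping h from S violates constraint h.
    """
    return all(h in violations[frozenset(S - {h})] for h in S)


# Fig. 6 as described in the text: x^(3) violates 4, x^(4) violates 2,
# and x^(2) violates 4.
fig6 = {
    frozenset({3}): {4},
    frozenset({4}): {2},
    frozenset({2}): {4},
}

is_optimal_34 = passes_rule_3({3, 4}, fig6)
is_optimal_24 = passes_rule_3({2, 4}, fig6)
```

The first check fails because $x^{(4)}$ violates constraint 2 rather than 3; the second succeeds, in line with the conclusion reached in the text.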



Fig. 6 
3. Analysis of the Procedure

Problem. Maximize (1.1) subject to (1.4), where all vectors and matrices are
real-valued, the vectors $x$, $a$, $d$ containing $n$, $n$, $N$ elements respectively, the matrix
$B$ being of order $n \times n$ and $C$ of order $n \times N$. The vectors and matrices $a$, $B$, $C$, $d$
are known.

It will prove very convenient to apply the language of set theory to the sets
of $N$ constraints $C'x \le d$. In particular, we shall consider subsets consisting of
constraints that are satisfied exactly by some $x$ ($c_h'x = d_h$ for some $h = 1, \ldots, N$)
as well as subsets of constraints that are violated by $x$ ($c_h'x > d_h$). Further,
we shall write 0 for the empty set, 8 $S = S'$ if the sets $S$ and $S'$ are identical,
$S \ne S'$ if they are not, $S \subset S'$ if all elements of $S$ are also elements of $S'$,
$h \in S$ if $h$ is an element of $S$, $SS'$ for the set of elements both in $S$ and in $S'$,
$S + S'$ for the set of elements either in $S$ or in $S'$ (with the understanding that
there are no elements both in $S$ and in $S'$, i.e., $SS' = 0$), and $S - S'$ for the
set of elements in $S$ but not in $S'$.

Definition 1. For each vector $x$ of $n$ elements, $U(x)$ is the subset of the $N$
constraints $C'x \le d$ that are satisfied exactly by $x$ (in equational form), and $V(x)$
is the subset of the $N$ constraints violated by $x$.
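Definition 1 can be illustrated with a short routine. The function name, the numerical tolerance used to decide exact equality in floating point, and the example data are assumptions made for this sketch; $C$ is stored with one column per constraint.

```python
def U_and_V(C, d, x, tol=1e-9):
    """Return (U(x), V(x)) as sets of 1-based constraint indices.

    U(x) collects constraints with c_h'x = d_h (up to `tol`);
    V(x) collects constraints with c_h'x > d_h.
    """
    n = len(x)
    U, V = set(), set()
    for h in range(len(d)):
        value = sum(C[i][h] * x[i] for i in range(n))
        if abs(value - d[h]) <= tol:
            U.add(h + 1)
        elif value > d[h]:
            V.add(h + 1)
    return U, V


# Hypothetical constraints: x1 >= 0, x2 >= 0, x1 + x2 <= 4, written as C'x <= d.
C = [[-1.0, 0.0, 1.0],
     [0.0, -1.0, 1.0]]
d = [0.0, 0.0, 4.0]

U, V = U_and_V(C, d, [0.0, 5.0])
```

For the point (0, 5) the first constraint is satisfied exactly and the third is violated; the two sets never share an element, since a constraint cannot hold with equality and be violated at the same point.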

Definition 2. Subject to conditions of existence and uniqueness, $\hat{x}$ is the vector
which maximizes (1.1) subject to (1.4), and $x^S$ is the vector which maximizes (1.1)
subject to some subset $S$ of the $N$ constraints $C'x \le d$ in equational form.

We shall make three assumptions, the first of which is made in order to ensure
that a solution exists (which requires that there is at least one $x$ satisfying the
constraints) and that there is maximization in a nontrivial sense (which
requires that at least two different $x$'s satisfy the constraints). The second
assumption ensures that the solution is unique, the third that there are no problems of
degeneracy. The last assumption implies that maximizing $Q$ subject to any
subset $S$ of the constraints in equational form leads to a vector $x^S$ which satisfies
only the constraints of $S$ exactly.

Assumption 1. There exist at least two vectors $x$ and $x^*$ of $n$ elements each such
that $x \ne x^*$ and $V(x) = V(x^*) = 0$.

Assumption 2. The matrix $B$ is positive-definite.

Assumption 3. For each subset $S$ of the $N$ constraints $C'x \le d$ for which $x^S$ exists
and is unique, $U(x^S) = S$.

We shall first prove two lemmas and then proceed to the main theorems. 

Lemma 1. For any two vectors $x$ and $x^*$ of $n$ elements each,

(3.1) $Q(y) > \mathrm{Min}\,[Q(x), Q(x^*)]$ where $y = \theta x + (1 - \theta)x^*$,

provided that $x \ne x^*$ and $0 < \theta < 1$.

8 As a general rule, the symbol 0 will occur in this section only as the empty set. There
is only one exception to this rule: it also occurs in the expressions like $0 < \theta < 1$ to indicate
that $\theta$ is a positive number smaller than 1. No confusion is likely to arise.











IV-9—DETERMINISTIC DECISION MODELS 


Proof. The quadratic function $Q$ is strictly concave in any one-dimensional
interval of its arguments (Assumption 2), hence

$Q\{\theta x + (1 - \theta)x^*\} > \theta Q(x) + (1 - \theta)Q(x^*)$
$\qquad = Q(x) + (1 - \theta)\{Q(x^*) - Q(x)\}$
$\qquad = Q(x^*) + \theta\{Q(x) - Q(x^*)\}.$

Since $0 < \theta < 1$, (3.1) follows immediately.
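The strict inequality invoked in this proof can also be checked directly from (1.1). The following identity is a supplementary derivation, not part of the original argument; it writes $y = \theta x + (1 - \theta)x^*$ and uses only the symmetry and positive-definiteness of $B$.

```latex
\begin{aligned}
Q(y) - \theta Q(x) - (1-\theta)\,Q(x^*)
  &= -\tfrac{1}{2}\, y'By + \tfrac{1}{2}\,\theta\, x'Bx
     + \tfrac{1}{2}\,(1-\theta)\, x^{*\prime}Bx^* \\
  &= \tfrac{1}{2}\,\theta(1-\theta)\,(x - x^*)'B\,(x - x^*) \;>\; 0
  \qquad (x \ne x^*,\; 0 < \theta < 1).
\end{aligned}
```

The linear terms $a'y$, $\theta a'x$ and $(1-\theta)a'x^*$ cancel, so only the quadratic part remains, and the last expression is positive because $B$ is positive-definite.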

Lemma 2. For any vector $x$ of $n$ elements,

(3.2) $U(x)V(x) = 0.$

For any two vectors $x$ and $x^*$ of $n$ elements each, if $y = \theta x + (1 - \theta)x^*$, then

(3.3) $U(x)U(x^*) \subset U(y)$ for any $\theta$;

(3.4) $V(y) = 0$ if $V(x) = V(x^*) = 0$ and $0 \le \theta \le 1$;

(3.5) $V(y) = 0$ for some $\theta$ ($0 < \theta < 1$) if $V(x) = U(x)V(x^*) = 0$.

Proof. (3.2): $c_h'x = d_h$ and $c_h'x > d_h$ contradict each other.

(3.3): If $h \in U(x)U(x^*)$, then $c_h'x = c_h'x^* = d_h$ and hence

$c_h'y = \theta c_h'x + (1 - \theta)c_h'x^* = \theta d_h + (1 - \theta)d_h = d_h.$

(3.4): $V(x) = V(x^*) = 0$ implies $c_h'x \le d_h$, $c_h'x^* \le d_h$ for all $h$. Hence
$c_h'y = \theta c_h'x + (1 - \theta)c_h'x^* \le d_h$ for all $h$ provided that $0 \le \theta \le 1$.

(3.5): Assume first $V(x^*) = 0$. Then it is given that $V(x) = V(x^*) = 0$ and hence
$V(y) = 0$ for all $\theta$ in (0, 1) according to (3.4). Assume next $V(x^*) \ne 0$; this
implies $x \ne x^*$, for $x = x^*$ is contradicted by $V(x) = 0 \ne V(x^*)$. If $h \in V(x^*)$
then $c_h'x^* > d_h$ and $c_h'x < d_h$ because $c_h'x > d_h$ is excluded by $V(x) = 0$ and
$c_h'x = d_h$ is excluded by $U(x)V(x^*) = 0$. But $c_h'x < d_h$, $c_h'x^* > d_h$ implies
that there exists some $\theta_h$ such that $\theta_h c_h'x + (1 - \theta_h)c_h'x^* = d_h$ and $0 < \theta_h < 1$.
For all $h \in V(x^*)$, write $\theta' = \mathrm{Max}_h\, \theta_h$; then $V(y) = 0$ if $\theta' \le \theta \le 1$.

Theorem 1 (existence and uniqueness). There is exactly one vector $\hat{x}$ satisfying
Definition 2, and for each subset $S$ of the $N$ constraints $C'x \le d$ there is either
exactly one vector $x^S$ satisfying Definition 2 or none at all.

Proof. Assumptions 1 and 2 ensure that there exists at least one vector which
maximizes $Q$. Suppose that there are two different vectors $x$ and $x^*$ which both
maximize $Q$ subject to the constraints. Then for any $y = \theta x + (1 - \theta)x^*$
such that $0 < \theta < 1$ we have $Q(y) > Q(x) = Q(x^*)$ because of Lemma 1 and $V(y) = 0$
because $V(x) = V(x^*) = 0$ [Lemma 2, (3.4)]. This contradicts the assumption
that both $x$ and $x^*$ maximize $Q$ subject to the constraints.

Similarly, suppose that both $x^S$ and $y^S$ ($\ne x^S$) maximize $Q$ subject to $S$ in
equational form. Then for any $y = \theta x^S + (1 - \theta)y^S$ such that $0 < \theta < 1$ we
have $Q(y) > Q(x^S) = Q(y^S)$ because of Lemma 1 and $S = U(x^S)U(y^S) \subset U(y)$
because of Lemma 2, (3.3), which contradicts again the assumption made.
Note that we cannot exclude the possibility that no $x^S$ exists because $S$ may be
contradictory in equational form.
Corollary 1. For any vector $x$ of $n$ elements,

(3.6) $V(x) = 0$ implies either $x = \hat{x}$ or $Q(x) < Q(\hat{x})$.

For any $x$ and any subset $S$ of the $N$ constraints $C'x \le d$ such that $x^S$ exists,

(3.7) $S \subset U(x)$ implies either $x = x^S$ or $Q(x) < Q(x^S)$.

Proof. Trivial, given the existence and uniqueness of $\hat{x}$ and the uniqueness of
$x^S$ if it exists.

Theorem 2 (exploring the attainable summit). For any subset $S$ of the $N$
constraints $C'x \le d$ such that $x^S$ exists, exactly one of the following possibilities applies:

I. (approaching the summit). $U(x^S) \subset U(\hat{x}) \ne U(x^S)$ implying
$U(\hat{x})V(x^S) \ne 0$.

II. (reaching the summit). $U(x^S) = U(\hat{x})$ implying $x^S = \hat{x}$.

III. (leaving the summit). $U(\hat{x}) \subset U(x^S) \ne U(\hat{x})$ implying $Q(x^S) < Q(\hat{x})$.

IV. (missing the summit). $U(\hat{x}) \ne U(\hat{x})U(x^S) \ne U(x^S)$ implying either
$V(x^S) \ne 0$, or $V(x^S) = 0$ and $U(x^S)V(x^{S_h}) = 0$ for some $S_h = U(x^S) - (h)$
where $(h)$ is a one-element subset of the constraints satisfying
$h \in U(x^S) - U(\hat{x})U(x^S)$.

Proof. The possibilities listed are the only ones, because the intersection
$U(\hat{x})U(x^S)$ is either identical with both sets (II), or identical with one of them
and a proper subset of the other (I, III), or a proper subset of both (IV).

I. We have $\hat{x} \ne x^S$, because $\hat{x} = x^S$ would imply $U(\hat{x}) = U(x^S)$. Then
$S = U(x^S) \subset U(\hat{x})$ implies $Q(\hat{x}) < Q(x^S)$ according to Corollary 1, (3.7); and
this implies $V(x^S) \ne 0$ according to (3.6). Assume $U(\hat{x})V(x^S) = 0$. Then
some $\theta$ exists such that $0 < \theta < 1$ and $V(y) = 0$ where $y = \theta\hat{x} + (1 - \theta)x^S$
[Lemma 2, (3.5)] and $Q(y) > \mathrm{Min}\,[Q(\hat{x}), Q(x^S)] = Q(\hat{x})$ (Lemma 1). This is a
contradiction; hence $U(\hat{x})V(x^S) \ne 0$.

II. We have $S = U(x^S) = U(\hat{x})$, so either $\hat{x} = x^S$ or $Q(\hat{x}) < Q(x^S)$ in
view of (3.7). Assume $\hat{x} \ne x^S$, in which case $V(x^S) \ne 0$ [because if $\hat{x} \ne x^S$
and $V(x^S) = 0$, then $Q(x^S) < Q(\hat{x})$ according to (3.6); and this is impossible
in view of our previous conclusion: either $\hat{x} = x^S$ or $Q(\hat{x}) < Q(x^S)$]. Now
$U(\hat{x})V(x^S) = U(x^S)V(x^S) = 0$ [Lemma 2, (3.2)]; and hence, noting that
$V(\hat{x}) = 0$ by definition and applying Lemma 2, (3.5), we conclude that some
$\theta$ exists such that $0 < \theta < 1$ and $V(y) = 0$ where $y = \theta\hat{x} + (1 - \theta)x^S$.
But also $Q(y) > \mathrm{Min}\,[Q(\hat{x}), Q(x^S)] = Q(\hat{x})$ according to Lemma 1. So the
assumption $\hat{x} \ne x^S$ leads to a contradiction, hence $\hat{x} = x^S$.

III. Here $x^S \ne \hat{x}$ because $x^S = \hat{x}$ would imply $U(x^S) = U(\hat{x})$. Also

$U(\hat{x})V(x^S) \subset U(x^S)V(x^S) = 0$

in view of Lemma 2, (3.2). Then, for some $\theta$ such that $0 < \theta < 1$, $V(y) = 0$
and $Q(y) > \mathrm{Min}\,[Q(\hat{x}), Q(x^S)]$ where $y = \theta\hat{x} + (1 - \theta)x^S$; see Lemma 2,
(3.5), and Lemma 1. Hence necessarily $Q(x^S) < Q(\hat{x})$.

IV. We have either T^rc 5 ) ^ 0 or V(x <s ) = 0. Since the statement specifies 
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Pig. 7 

nothing with respect to the first possibility, 9 we consider V{x s ) = 0. Further, 
we take V(x ) 9 ^ 0 for all S h = U(x 3 ) — ( h ) = S — (h) [where 

h € U(x s ) - U(x)U(x s )\ 

because the assertion U(x s )V(x sh ) = 0 for some S h is trivially true if 

V(x sk ) = 0 

for some S k . Then x x s [because x = x s would imply U(x) = U(x 8 ) = 
U(x)U(x s )} and * i, z s * ^ ** for all S 4 because 

V{x sh ) 9 * V(x) = F(x s ) = 0. 

Assume U(x s )V(x sh ) 9 * 0 for all S h and consider y = 6x + (1 — 0)£ 6 h z sh 
where 0 < 6, 6 h < 1 and £ 6 h = 1 “ Write also c'*x = d k - S k and 

c'kZ S — dk *kh 

for * € ~ U(x)U(x s ). Then 4 > 0 [because 4 < 0 is excluded by 

V{x) = 0, and5* = Oby ( k)U(x ) = 0], €** = Oif A ^ A [because (Jc)U(x sh ) = 
(k)U(x ) — (A)(A) = (A)E/(a; 5 ) = ( k ) if A ^ A], and e** > 0 if k = A 
[because €*a = 0 if A 5 ^ A, = 0 if A = A for all k and fixed A would contra¬ 
dict U(x )V(x ) 7*^ 0]. Applying this 5, e-notation, we have 

C k V ~ Q(dk “ 8*) + (1 — 0)X>0ft(d;fc + €JWt) 

= d k — 68k + (1 — 6)6 k €kk 

for any k € t/(x 5 ) — U(x)U(x s ). Now if we choose 6 , 0* such that 

05* = (1 ~ 0)0*6*;* , 

9 This possibility is a real one, see Fig. 7. There we have £ = U(x s ) =* (1), U(x) = (2) 

F(a: 5 ) =5 (2). ’ * 

10 In the special case when S is a one-element constraint set, h takes only one value and 
tf* must be taken as 1 The case S = 0 is excluded in Possibility IV, because S = 0 implies 
U(x s ) = U(x)U(x s ) (=0). 
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this means U(x s ) - U(x)U(x s ) c U(y). 11 But also U(x)U(x s ) c tf(y), as 
follows from repeated application of Lemma 2, (3.3) ; 12 and hence U(x s ) C U(y). 
However, Q{y) > Min* [Q(x), Q(x sh )], as follows from repeated application of 
Lemma l; 13 and Q(x) > Q(x s ) because V(x s ) = 0 and x ^ x s [see (3.6)] and 
Q(x sh ) > Q(x s ) because S h c U(x s ) and z* h ^ [see (3.7)]. Hence 

Q(y) > Q(x s ). 

But we just derived S = U(x s ) C U(y), which implies either y = x s or 

Q(y) < Q0&*)> 

both of which contradict Q(y) > Q(x 5 ). Hence U(x s )V(x sh ) = 0 for some 
such that A £ U(x s ) — U(x)U(x s ). 

Corollary 2. x = x s for some subset S of the N constraints C'x ^ d. 

Proof. Write $ = U(x) and consider aA We have U(x) = & = U(x s ) which 
implies that we are in Possibility II; hence x = aA 
Corollary 3. x — x° if and only if V(x°) = 0. 

Proof. The necessity of the condition V(x°) = 0 is obvious, so we confine 
ourselves to the sufficiency. Applying (3.6), we find that V(x°) = 0 implies 
either x° = x or Q(x°) < Q(£); applying (3.7), we find that 0 C U(x) implies 

11 When $ is a one-element constraint set, we take 0* — 1 and 6 — eW(5* + «**); see foot¬ 
note 10. In the general case, when the index Jc takes p values (say), the 6k are to be specified 
such that 


6i €n dp €pp 



and such that XX = 1* Given the positive signs of the 5*s and «’s involved, this leads to 
unique positive values of the 0 *. Finally, 0 = Bk^kk/ ( 0 * + 0 ^**) for any k. 

12 Wehave U(x)U(x s ) c U(x) and U(x)ZJ(x s ) C U(x sh ) for all S h such that h € U(x s )• - 
f / (£) U (x s ). Hence U (x) U (x s ) is a subset of the intersection of U (x) and all relevant U (aA); 
and the statement made in the text is proved when it is shown that all constraints of this 
intersection are satisfied exactly by any linear combination y of x and the x - This can 
either be proved directly [by means of a trivial extension of the proof of (3.3)] or by repeated 
application of (3.3), as follows. Write the vectors x, x s as z i, z 2 , ... and y — with 

T m — 1. Supposing that i takes 3 values, we can write 


y 


^ €j Zi (1 “ €l) 


€2 32 €3 23 

€2 + €3 


so that U(zi)U(z*) c: U(y) according to (3.3), where z* = (€ 2 z 2 -h e z z z ) / (e 2 + e z ). But 
U(z 2 )U(z z ) c U(z*), hence U(zJU(z 2 )U(z z ) c U(y). This is easily extended to the case in 
which i takes more values. 

13 In the notation used at the end of footnote 12, we have to prove $Q(y) > \operatorname{Min}\,[Q(z_1), Q(z_2), Q(z_3)]$. Now $Q(y) \geq \operatorname{Min}\,[Q(z_1), Q(z^*)]$ and $Q(z^*) \geq \operatorname{Min}\,[Q(z_2), Q(z_3)]$,
where the equality sign holds if and only if the two vectors between square brackets ($z_1$,
$z^*$ and $z_2$, $z_3$) are equal. Hence the statement made holds except when the vectors $z_1$, $z_2$,
$z_3$ are all equal. This exception does not occur here, since $\hat{x}$ differs from the $x^{S_h}$, as follows
from $V(\hat{x}) = 0 \neq V(x^{S_h})$. The extension to the case of a larger number of vectors is equally
simple.
either $x^0 = \hat{x}$ or $Q(\hat{x}) < Q(x^0)$. Hence $x^0 = \hat{x}$ (which means that we are in
Possibility II of Theorem 2 for S = 0).

Corollary 4 (Rules 1 and 2).14 If $S + S' = U(\hat{x})$ and $S' \neq 0$, then $x^S$ exists
and $S'V(x^S) \neq 0$.

Proof. The existence of $x^S$ follows from the fact that the constraints of S are
not contradictory in equational form; for if they were contradictory, so would
those of $S + S' = U(\hat{x})$ be, implying the non-existence of $\hat{x}$ (which is ruled
out by Theorem 1). Considering Possibility I of Theorem 2, we find that it
implies $U(\hat{x})V(x^S) = (S + S')V(x^S) = S'V(x^S) \neq 0$ because

$SV(x^S) = U(x^S)V(x^S) = 0$

in view of Lemma 2, (3.2). The other possibilities of Theorem 2 are all excluded,
because $U(x^S) = S \subset S + S' = U(\hat{x}) \neq U(x^S)$ leads to Possibility I only.

Corollary 5 (Rule 3). Suppose that for some subset S of the N constraints
$C'x \leq d$, $x^S$ exists and $V(x^S) = 0$. Then $x^S = \hat{x}$ if and only if

$U(x^S)V(x^{S_h}) \neq 0$

for all $S_h = S - (h)$ where (h) is a one-element subset of the constraints such
that $h \in S$.

Proof. (a) Necessity: We have to prove for any $x^{S_h}$ that if $x^S = \hat{x}$ [in which
case it is necessarily true that $V(x^S) = 0$], $U(x^S)V(x^{S_h}) \neq 0$. Suppose

$U(x^S)V(x^{S_h}) = U(\hat{x})V(x^{S_h}) = 0$

for some $x^{S_h}$. Consider then $y = \theta\hat{x} + (1 - \theta)x^{S_h}$ for $0 < \theta < 1$; since

$U(x^{S_h}) = S_h \neq S = U(x^S) = U(\hat{x})$

we have $\hat{x} \neq x^{S_h}$ and hence $Q(y) > \operatorname{Min}\,[Q(\hat{x}), Q(x^{S_h})]$ in view of (3.1); also
$Q(\hat{x}) < Q(x^{S_h})$ because $S_h \subset U(x^S) = U(\hat{x})$ and $x^{S_h} \neq \hat{x}$, see (3.7). Hence
$Q(y) > Q(\hat{x})$. However, $V(y) = 0$ for some $\theta$ such that $0 < \theta < 1$, as follows
from $U(\hat{x})V(x^{S_h}) = 0$. This involves a contradiction, hence $U(x^S)V(x^{S_h}) \neq 0$
for each $x^{S_h}$.


(b) Sufficiency: We have to prove that if $V(x^S) = 0$ and if $U(x^S)V(x^{S_h}) \neq 0$
for each $x^{S_h}$, then $x^S = \hat{x}$. Considering Possibility I first, we observe that it
must be ruled out because its implication $U(\hat{x})V(x^S) \neq 0$ is contradicted by
$V(x^S) = 0$ which is given. The same applies to Possibility IV, because it implies
$U(x^S)V(x^{S_h}) = 0$ for some $x^{S_h}$ if $V(x^S) = 0$, which is contradicted by
$U(x^S)V(x^{S_h}) \neq 0$ for each $x^{S_h}$. As to Possibility III, consider

$y = \theta\hat{x} + (1 - \theta) \sum_h \theta_h x^{S_h}$

where $0 < \theta, \theta_h < 1$ and $\sum_h \theta_h = 1$,15 the constraints h over which summation
takes place satisfying $h \in U(x^S) - U(\hat{x})$. We have $Q(y) > \operatorname{Min}_h\,[Q(\hat{x}), Q(x^{S_h})]$

14 Rule 1 deals with the case S = 0, Rule 2 with $S \neq 0$.

15 Except that one must take $\theta_h = 1$ if S is a one-element constraint set (in which case
$\hat{x} \neq x^0$, given that Possibility III is assumed to apply to S here).
as follows from repeated application of Lemma 1;16 also $Q(\hat{x}) > Q(x^S)$ because
this is the implication of Possibility III; and $Q(x^{S_h}) > Q(x^S)$ for all $x^{S_h}$ (all
$h \in S$) because of (3.7) [the possibility $x^{S_h} = x^S$ being excluded because

$U(x^{S_h}) = S_h \neq S = U(x^S)$].

Hence $Q(y) > Q(x^S)$. Further, we have $U(\hat{x}) \subset U(x^{S_h})$ for all $x^{S_h}$ considered
here [because $U(\hat{x}) \subset U(x^S)$ and $h \in U(x^S) - U(\hat{x})$], so $U(\hat{x}) \subset U(y)$.17 But
in addition to this, we can choose $\theta$, $\theta_h$ such that $U(x^S) - U(\hat{x}) \subset U(y)$,18 in
which case $U(x^S) \subset U(y)$. This, however, contradicts $Q(y) > Q(x^S)$ according
to (3.7). Hence Possibility III is also ruled out. So only Possibility II remains,
implying $x^S = \hat{x}$.


4. An Example; Directory of Computations 

The computational procedure will be illustrated by means of an example 
used by Houthakker [4] for the illustration of his capacity method. He considers 
a monopolist who faces four linear demand functions for his four products: 


(4.1)

$$
\begin{aligned}
x_1 &= 18.239 - 2.086p_1 + 0.255p_2 + 1.033p_3 - 0.374p_4 \\
x_2 &= 1.898 + 0.255p_1 - 0.499p_2 - 0.129p_3 + 0.217p_4 \\
x_3 &= -4.916 + 1.033p_1 - 0.129p_2 - 0.759p_3 + 0.254p_4 \\
x_4 &= 7.923 - 0.374p_1 + 0.217p_2 + 0.254p_3 - 0.512p_4
\end{aligned}
$$

where the x's are the quantities produced and sold and the p's prices. The problem
is to maximize total gross revenue subject to certain constraints. Total gross
revenue is of the form $\sum p_i x_i$, which is a quadratic form in the prices given
that the x's are linear in the prices, see (4.1). But we may also express the
programming problem in quantities instead of prices by solving the system
(4.1) for the p's, which leads to four “inverted demand equations” which are
linear in the x's; and this, in turn, makes $\sum p_i x_i$ quadratic in the x's. As long
as there is no problem of uncertainty about the numerical values of the coefficients
of the problem, it does not matter whether we work with p's or x's. Following
Houthakker, we shall use the x-approach, which leads to the objective
function

(4.2)

$$
Q(x) = \sum_{i=1}^{4} p_i x_i = 18x_1 + 16x_2 + 22x_3 + 20x_4
 - \tfrac{1}{2}\{6x_1^2 + 2x_1x_2 + 16x_1x_3 + 10x_2^2 + 2x_2x_3 + 8x_2x_4 + 17x_3^2 + 6x_3x_4 + 11x_4^2\},
$$


16 See footnote 13. We have $V(\hat{x}) = 0 \neq V(x^{S_h})$, hence $\hat{x} \neq x^{S_h}$; so the exception mentioned
in that footnote occurs neither here nor there.

17 The proof is entirely similar to that of footnote 12. 

18 The proof is entirely similar to that of footnote 11 and the accompanying text of the 
proof of Possibility IV of Theorem 2. 
or in the matrix notation of (1.1):

(4.3)

$$
a = \begin{bmatrix} 18 \\ 16 \\ 22 \\ 20 \end{bmatrix}; \qquad
B = \begin{bmatrix} 6 & 1 & 8 & 0 \\ 1 & 10 & 1 & 4 \\ 8 & 1 & 17 & 3 \\ 0 & 4 & 3 & 11 \end{bmatrix}.
$$


There are two types of constraint subject to which maximization takes place. 
First, there is the requirement that none of the quantities be negative: 


1. $x_1 \geq 0$
2. $x_2 \geq 0$
3. $x_3 \geq 0$
4. $x_4 \geq 0$.

Second, there are constraints due to the limited availability of certain factors
of production. Thus, the production of each unit of $x_1$, $x_2$, $x_3$ or $x_4$ requires 1
unit of a factor A of which the supply is limited to $1\tfrac{2}{3}$ units; and there is a factor
B which is used for $x_1$ and $x_3$, and a factor C which is used for $x_2$ and $x_4$, both
of which are in limited supply. So we have three additional constraints which
are specified as

5. $x_1 + x_2 + x_3 + x_4 \leq 1\tfrac{2}{3}$

6. $5x_1 + 10x_3 \leq 2$

7. $4x_2 + 5x_4 \leq 3$.

Combining these seven constraints, we arrive at the general form $C'x \leq d$ when
we specify


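(4.4)

$$
C = \begin{bmatrix}
-1 & 0 & 0 & 0 & 1 & 5 & 0 \\
0 & -1 & 0 & 0 & 1 & 0 & 4 \\
0 & 0 & -1 & 0 & 1 & 10 & 0 \\
0 & 0 & 0 & -1 & 1 & 0 & 5
\end{bmatrix}; \qquad
d = \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \\ 1\tfrac{2}{3} \\ 2 \\ 3 \end{bmatrix}.
$$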


The computational procedure can then be described conveniently in terms
of the following three steps:

Initial Step. Compute the vector e defined in (2.7), i.e.,

$e = C'x^0 - d = C'B^{-1}a - d.$

If all elements of e are nonpositive, then the unconstrained maximum $x^0 = B^{-1}a$
satisfies all constraints and no further computations are necessary. If one or
more elements of e are positive, compute E defined in (2.6), i.e. $E = C'B^{-1}C$;
proceed then to the Intermediate Steps.

In our example, we have for -e:19

-e = {4.560 0.475 -1.229 1.981 -4.119 -8.508 -8.802},

19 The computations were carried out in five decimal places, but they are reported here
in three only.
which shows that $x^0$ violates constraints 3, 5, 6, 7. So we have to compute E,
which is the symmetric matrix

$$
E = \begin{bmatrix}
1.043 & -0.128 & -0.516 & 0.187 & -0.586 & -0.051 & -0.426 \\
-0.128 & 0.250 & 0.064 & -0.108 & -0.078 & -0.007 & -0.457 \\
-0.516 & 0.064 & 0.379 & -0.127 & 0.200 & -1.211 & 0.377 \\
0.187 & -0.108 & -0.127 & 0.256 & -0.208 & 0.333 & -0.846 \\
-0.586 & -0.078 & 0.200 & -0.208 & 0.673 & 0.936 & 1.352 \\
-0.051 & -0.007 & -1.211 & 0.333 & 0.936 & 12.363 & -1.636 \\
-0.426 & -0.457 & 0.377 & -0.846 & 1.352 & -1.636 & 6.056
\end{bmatrix}.
$$
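The Initial Step for this example can be retraced with a few lines of code. The following is a minimal sketch, not the authors' own computation: it assumes a, B, C and d as given in (4.3) and (4.4), with the supply of factor A taken as 5/3, and the helper `solve` and all variable names are introduced here for illustration only. Exact rational arithmetic is used so that no rounding tolerance is needed when reading off signs.

```python
from fractions import Fraction as F

def solve(A, b):
    """Solve A z = b exactly by Gauss-Jordan elimination (None if A is singular)."""
    n = len(A)
    M = [[F(v) for v in row] + [F(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        piv = next((r for r in range(col, n) if M[r][col] != 0), None)
        if piv is None:
            return None
        M[col], M[piv] = M[piv], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(n):
            if r != col:
                M[r] = [vr - M[r][col] * vc for vr, vc in zip(M[r], M[col])]
    return [row[n] for row in M]

# Data of (4.3) and (4.4); Ct holds the rows of C', one per constraint 1..7.
a = [18, 16, 22, 20]
B = [[6, 1, 8, 0], [1, 10, 1, 4], [8, 1, 17, 3], [0, 4, 3, 11]]
Ct = [[-1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1],
      [1, 1, 1, 1], [5, 0, 10, 0], [0, 4, 0, 5]]
d = [0, 0, 0, 0, F(5, 3), 2, 3]

x0 = solve(B, a)                                   # unconstrained maximum B^{-1} a
e = [sum(c * x for c, x in zip(row, x0)) - dk for row, dk in zip(Ct, d)]
violated = [k + 1 for k, ek in enumerate(e) if ek > 0]
E11 = -solve(B, Ct[0])[0]                          # c_1' B^{-1} c_1, since c_1 = -(1, 0, 0, 0)'

print([round(float(x), 3) for x in x0])
print([round(float(-ek), 3) for ek in e])
print(violated)
print(round(float(E11), 3))
```

With B taken literally from (4.3), the leading element of $C'B^{-1}C$ comes out at about half the tabulated 1.043, so the printed E appears to be scaled by a factor 2 relative to this B. Only products of the form $F E_S^{-1} e_S$ enter the sign table, and these are unaffected by such a common factor.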


Intermediate Steps, No. 1. Set up a “sign table of quadratic programming”
which is a rectangular array of signs (plus, minus, or zero) the rows of which
correspond to the N constraints, the columns to vectors $x^S$ obtained by maximizing
Q subject to a subset S of these constraints in equational form; these signs
are for any $x^S$ the signs of the successive elements of $F E_S^{-1} e_S - e_T$, which should
be nonnegative in order that $x^S$ satisfies the constraints, see (2.10). Indicate
then the signs of $-e$ for $x^0$ in the first column (we have $-e = F E_S^{-1} e_S - e_T$ if
S = 0) and write the constraint numbers corresponding to negative signs in
the headings of the next columns. After this, compute $F E_S^{-1} e_S - e_T$ for all constraint
sets S which consist of the single elements in the headings just mentioned
(in accordance with Rule 1); indicate the signs of the successive elements
of $F E_S^{-1} e_S - e_T$ in the relevant place of their column, a dot being used for those
signs which are imposed to be zero. As soon as a column emerges which has no
minus signs, the further parts of this step are to be omitted and one should
proceed to the Final Step immediately; when all columns have at least one
minus sign, one has to proceed to Intermediate Step No. 2.

The signs of the first column are supplied immediately by the Initial Step,
and so we write 3, 5, 6, 7 in the headings of the next columns (see the Sign
Table below). We then have to consider $F E_S^{-1} e_S - e_T$ for S = (3), (5), (6),


SIGN TABLE OF QUADRATIC PROGRAMMING: 
HOUTHAKKER’S MONOPOLIST 



* No constraint imposed in equational form. 
(7). For S = (3) e.g. this is

$$
\begin{bmatrix} -0.516 \\ 0.064 \\ -0.127 \\ 0.200 \\ -1.211 \\ 0.377 \end{bmatrix}
[0.379]^{-1}(1.229) -
\begin{bmatrix} -4.560 \\ -0.475 \\ -1.981 \\ 4.119 \\ 8.508 \\ 8.802 \end{bmatrix}
=
\begin{bmatrix} 2.886 \\ 0.684 \\ 1.570 \\ -3.473 \\ -12.430 \\ -7.582 \end{bmatrix},
$$

and the six resulting signs are specified in the second column of the Sign Table;
the dot in the third row indicates that the third constraint is imposed in equational
form. It is seen that each of the four vectors $x^{(3)}$, $x^{(5)}$, $x^{(6)}$, $x^{(7)}$ violates at
least three constraints, so we have to proceed to Intermediate Step No. 2.
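A column of the sign table can be reproduced by computing $x^S$ and then the slacks $d - C'x^S$, whose entries outside S coincide with $F E_S^{-1} e_S - e_T$. The sketch below assumes the data of (4.3) and (4.4); `solve` and `column` are hypothetical helper names, not part of the original paper.

```python
from fractions import Fraction as F

def solve(A, b):
    """Solve A z = b exactly by Gauss-Jordan elimination (None if A is singular)."""
    n = len(A)
    M = [[F(v) for v in row] + [F(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        piv = next((r for r in range(col, n) if M[r][col] != 0), None)
        if piv is None:
            return None
        M[col], M[piv] = M[piv], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(n):
            if r != col:
                M[r] = [vr - M[r][col] * vc for vr, vc in zip(M[r], M[col])]
    return [row[n] for row in M]

a = [18, 16, 22, 20]
B = [[6, 1, 8, 0], [1, 10, 1, 4], [8, 1, 17, 3], [0, 4, 3, 11]]
Ct = [[-1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1],
      [1, 1, 1, 1], [5, 0, 10, 0], [0, 4, 0, 5]]
d = [0, 0, 0, 0, F(5, 3), 2, 3]

def column(S):
    """Slacks d - C'x^S for all 7 constraints; a negative entry marks a violation."""
    x0 = solve(B, a)
    rows = [Ct[h - 1] for h in S]
    Binv_c = [solve(B, r) for r in rows]                         # B^{-1} c_h for h in S
    E_S = [[sum(p * q for p, q in zip(r, z)) for z in Binv_c] for r in rows]
    e_S = [sum(p * q for p, q in zip(r, x0)) - d[h - 1] for r, h in zip(rows, S)]
    lam = solve(E_S, e_S)                                        # E_S^{-1} e_S
    xS = [x - sum(l * z[i] for l, z in zip(lam, Binv_c)) for i, x in enumerate(x0)]
    return [dk - sum(c * x for c, x in zip(row, xS)) for row, dk in zip(Ct, d)]

col3 = [round(float(s), 3) for s in column((3,))]
signs = ["." if k + 1 == 3 else ("+" if s >= 0 else "-") for k, s in enumerate(col3)]
print(col3)
print(signs)
```

The entry for a constraint in S is zero by construction and is shown as a dot, as in the Sign Table.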
Intermediate Steps, No. 2. Indicate in the headings of the open columns next
to the columns of the one-element constraint sets which were prepared in Intermediate
Step No. 1, the two-element sets which are to be considered next; do
so in accordance with Rule 2, viz., by combining the constraint which is imposed
in equational form with each of the violated constraints. Compute then

$F E_S^{-1} e_S - e_T$

for each of the resulting (two-element) sets S. As soon as a column emerges
without minus signs, the further parts of this step should be omitted and one
should proceed to the Final Step immediately; otherwise one has to proceed to
Intermediate Step No. 3.

In our case we have to consider 8 two-element sets; we note that six additional
sets, viz., (5, 3), (6, 3), (6, 5), (7, 3), (7, 5), (7, 6), need not be analyzed separately
because they occur in reverse order also [like (3, 5), (3, 6), etc.]. We then
compute $F E_S^{-1} e_S - e_T$ for each of these eight S's. For example, taking S = (3, 5)
we have

$$
\begin{bmatrix} -0.516 & -0.586 \\ 0.064 & -0.078 \\ -0.127 & -0.208 \\ -1.211 & 0.936 \\ 0.377 & 1.352 \end{bmatrix}
\begin{bmatrix} 0.379 & 0.200 \\ 0.200 & 0.673 \end{bmatrix}^{-1}
\begin{bmatrix} 1.229 \\ 4.119 \end{bmatrix}
-
\begin{bmatrix} -4.560 \\ -0.475 \\ -1.981 \\ 8.508 \\ 8.802 \end{bmatrix}
=
\begin{bmatrix} \text{[illegible]} \\ -0.002 \\ 0.707 \\ -2.809 \\ -0.527 \end{bmatrix}
$$

and the resulting signs (with dots inserted in the third and the fifth place) are
specified in the column under (3, 5). It is seen that in each of the eight columns
there are at least two negative entries, so we proceed to Intermediate Step No. 3.

Intermediate Steps, No. 3. In accordance with Rule 2, indicate in the headings
of the open columns next to the columns of the two-element constraint sets
which were prepared in Intermediate Step No. 2, the three-element sets S which
are to be considered next. Compute for each of these $F E_S^{-1} e_S - e_T$ but proceed
to the Final Step immediately as soon as such a vector contains nonnegative
elements only; otherwise proceed to Intermediate Step No. 4 which deals with
four-element constraint sets in the same way.

This requires the consideration of 9 three-element sets. Considering
in particular the fourth, (3, 6, 7), we find for $F E_S^{-1} e_S - e_T$:
All elements of this vector are positive, so we proceed to the Final Step. For
completeness' sake, the columns of the five remaining vectors [$x^{(3,7,2)}, \ldots, x^{(6,7,2)}$]
are also specified; but this is not required since we can proceed to the Final Step
immediately after the fourth vector.

Final Step. As soon as an Intermediate Step has led to an $x^S$ which violates
none of the constraints, verify the hypothesis $x^S = \hat{x}$ by considering all vectors
$x^{S_h}$. If each $x^{S_h}$ violates constraint h, then the hypothesis is correct. If some $x^{S_h}$
does not violate constraint h, the hypothesis is not correct. In that case one has
to take up the Intermediate Steps again and to proceed until a new $x^S$ is found
which violates none of the constraints, after which the Final Step is applied to
this $x^S$; and so on.

In our case there are three sets $S_h$ to be considered, viz., (3, 6), (3, 7), and
(6, 7). It happens that the corresponding vectors have all been considered in
Intermediate Step No. 2: $x^{(3,6)}$ violates constraint 7 (and also 5), $x^{(3,7)}$ violates
6 (and also 2 and 5), and $x^{(6,7)}$ violates 3 (and also 2 and 5). The conclusion is
$x^{(3,6,7)} = \hat{x}$; the numerical value of this vector is

$\hat{x}$ = {0.400 0.233 0 0.414},

as follows from the numerical specification given in Intermediate Step No. 3.20
The corresponding Q-value is [see (2.11)]:


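$$
Q(\hat{x}) = \tfrac{1}{2}\,[18 \;\; 16 \;\; 22 \;\; 20]
\begin{bmatrix} 6 & 1 & 8 & 0 \\ 1 & 10 & 1 & 4 \\ 8 & 1 & 17 & 3 \\ 0 & 4 & 3 & 11 \end{bmatrix}^{-1}
\begin{bmatrix} 18 \\ 16 \\ 22 \\ 20 \end{bmatrix}
- [1.229 \;\; 8.508 \;\; 8.802]
\begin{bmatrix} 0.379 & -1.211 & 0.377 \\ -1.211 & 12.363 & -1.636 \\ 0.377 & -1.636 & 6.056 \end{bmatrix}^{-1}
\begin{bmatrix} 1.229 \\ 8.508 \\ 8.802 \end{bmatrix}
= 17.037.
$$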
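The three kinds of step can be put together in a compact search. The sketch below is one possible reading of the procedure, not the authors' program: it grows the sets S level by level according to Rules 1 and 2, skips sets whose equations are contradictory, and accepts a feasible $x^S$ only when the Rule 3 test of the Final Step succeeds. Data are those of (4.3) and (4.4); all function names are hypothetical.

```python
from fractions import Fraction as F

def solve(A, b):
    """Solve A z = b exactly by Gauss-Jordan elimination (None if A is singular)."""
    n = len(A)
    M = [[F(v) for v in row] + [F(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        piv = next((r for r in range(col, n) if M[r][col] != 0), None)
        if piv is None:
            return None
        M[col], M[piv] = M[piv], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(n):
            if r != col:
                M[r] = [vr - M[r][col] * vc for vr, vc in zip(M[r], M[col])]
    return [row[n] for row in M]

a = [18, 16, 22, 20]
B = [[6, 1, 8, 0], [1, 10, 1, 4], [8, 1, 17, 3], [0, 4, 3, 11]]
Ct = [[-1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1],
      [1, 1, 1, 1], [5, 0, 10, 0], [0, 4, 0, 5]]
d = [0, 0, 0, 0, F(5, 3), 2, 3]

def x_S(S):
    """Maximize Q with the constraints in S (numbered 1..7) imposed as equations."""
    x0 = solve(B, a)
    rows = [Ct[h - 1] for h in S]
    Binv_c = [solve(B, r) for r in rows]
    E_S = [[sum(p * q for p, q in zip(r, z)) for z in Binv_c] for r in rows]
    e_S = [sum(p * q for p, q in zip(r, x0)) - d[h - 1] for r, h in zip(rows, S)]
    lam = solve(E_S, e_S)
    if lam is None:                       # equations of S dependent or contradictory
        return None
    return [x - sum(l * z[i] for l, z in zip(lam, Binv_c)) for i, x in enumerate(x0)]

def violated(x, S):
    return [h for h in range(1, 8) if h not in S
            and sum(c * v for c, v in zip(Ct[h - 1], x)) > d[h - 1]]

def rule3(S):
    """Final Step: each x^{S-h} must violate the constraint h that was dropped."""
    for h in S:
        rest = tuple(k for k in S if k != h)
        x = x_S(rest)
        if x is None or h not in violated(x, rest):
            return False
    return True

def search():
    level = [()]                          # start from the empty set, i.e. x^0
    while level:
        nxt = set()
        for S in level:
            x = x_S(S)
            if x is None:
                continue
            V = violated(x, S)
            if not V:
                if rule3(S):
                    return S, x
            else:
                nxt.update(tuple(sorted(S + (h,))) for h in V)
        level = sorted(nxt)
    return None

S_hat, x_hat = search()
Q_hat = sum(ai * xi for ai, xi in zip(a, x_hat)) - F(1, 2) * sum(
    x_hat[i] * B[i][j] * x_hat[j] for i in range(4) for j in range(4))
print(S_hat, [round(float(v), 3) for v in x_hat], round(float(Q_hat), 3))
```

Because the sets are stored in sorted order, the columns are not visited in the same sequence as in the Sign Table, but the set that passes the Final Step is the same.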


5. Concluding Remarks 

Whether the present method is or is not computationally efficient compared 
with other methods of quadratic programming is a question that does not admit 

20 When nonnegativity constraints are imposed on each of the elements of x, the vector
$F E_S^{-1} e_S - e_T$ (completed with zeros at appropriate places) gives $x^S$ immediately. If this
is not the case, $x^S$ is to be found from

$x^S = x^0 - B^{-1}C_S E_S^{-1} e_S = B^{-1}(a - C_S E_S^{-1} e_S),$

where $C_S$ is the submatrix of C corresponding to the constraints of S. This result follows
directly from (2.4) and (2.8).
a unique answer applicable to all cases.21 To take an extreme example,
Houthakker's capacity method works in the simplest conceivable way when
$\hat{x} = 0$ (because this method “starts in the origin”) but it is much worse off when
the optimal vector is that of the unconstrained maximum. On the other hand,
the present method is simplest when the latter alternative applies, while it is
much poorer when $\hat{x} = 0$ because this implies that as many as n constraints are
satisfied in equational form. Generally, the method is simple as long as the constrained
maximum satisfies few constraints exactly. This follows directly from
the fact that the successive intermediate steps require the inversion of matrices
the order of which increases successively.

There is one important situation in which the method seems to be very advantageous.
Suppose that a quadratic programming problem has been solved
(by one method or another) and that one is interested in the sensitivity of the
solution for changes in the constraints or in the coefficients of the objective
function. As long as such changes are small, there is a good chance that the set
S of constraints which the new optimum satisfies in equational form is the same
as the similar set of the old optimum. This hypothesis can be tested in a straightforward
fashion by means of Rule 3, which means that the initial step and all
intermediate steps can be deleted if the test turns out to be positive; and the
final step which is carried out gives then the new optimum immediately.
NOTES ON QUADRATIC PROGRAMMING:

THE KUHN-TUCKER AND THEIL-VAN DE PANNE 
CONDITIONS, DEGENERACY, AND EQUALITY 
CONSTRAINTS* 

J. C. G. BOOT 1 

Netherlands School of Economics , Rotterdam 
1. Introduction 

Let us consider the quadratic programming problem 

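(1.1) Max $Q(x) = a'x - \tfrac{1}{2}x'Bx \;\; \left(= \sum_i a_i x_i - \tfrac{1}{2}\sum_i \sum_j x_i x_j b_{ij}\right)$

under the side conditions

(1.2) $C^{*\prime}x \leq d^* \;\; \left(\sum_i c^*_{ki} x_i \leq d^*_k;\; k = 1, \ldots, m\right)$

and the non-negativity conditions

(1.3) $x \geq 0 \;\; (x_i \geq 0;\; i = 1, \ldots, n)$.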

The inequality conditions (1.2) and (1.3) will be referred to as constraints.
For any n-vector x, constraints may either be amply satisfied (i.e., the strict
inequality holds), or binding (i.e., the strict equality holds), or violated. If no
constraint is violated, the x-vector concerned is called feasible. It is assumed that
at least one feasible vector exists. The matrix B is supposed to be positive definite,
which is a sufficient condition to ensure that Q(x) is bounded above within the
constraints.2 Moreover, it then follows easily that Q(x) is strictly concave, and
hence the solution vector is unique.

The purpose of this note is to prove that the rules on which Theil and Van de
Panne [6] based their recent method follow straightforwardly from the well-known
Kuhn-Tucker conditions [5]; to propose a method for handling degeneracy in
quadratic programming; and to consider the case in which there are linear
equality constraints besides the inequalities (1.2)-(1.3). In the next section
the Kuhn-Tucker conditions will be quoted; Section 3 will summarize the rules
of Theil and Van de Panne, and in Section 4 these rules will be proved using the
Kuhn-Tucker conditions. Section 5 will proceed by discussing degeneracy; this is
followed by a numerical illustration of degeneracy in Section 6. The final section 
considers the case in which some of the constraints take the form of equations. 

* Received February 1961. 

1 The author is indebted to Prof. H. Theil and Mr. P. J. M. van den Bogaard for their 
comment, criticism and help. Moreover, the article has greatly benefited from a number 
of suggestions and corrections of Professor C. E. Lemke. 

2 The condition is a little stronger than necessary. See [2]. 
2. The Kuhn-Tucker Conditions

The well-known Kuhn-Tucker conditions assert that $\hat{x}$ is a solution of the
problem if and only if there exists a vector u such that, apart from (1.2) and
(1.3),

(2.1) $u \geq 0 \;\; (u_k \geq 0;\; k = 1, \ldots, m)$

(2.2) $u'd^* - u'C^{*\prime}x = 0$

(2.3) $Bx + C^*u - a \geq 0$

and

(2.4) $x'Bx + x'C^*u - x'a = 0$.

For our purposes it will prove useful slightly to rewrite the above conditions, 
following the development of Barankin and Dorfman [2]. 

Definitions: 


(2.5) $v = Bx + C^*u - a \geq 0$ [compare (2.3)]

(2.6) $y = d^* - C^{*\prime}x \geq 0$ [compare (1.2)].

With these definitions, the new, equivalent formulation of the Kuhn-Tucker
conditions may now be stated as follows: find vectors x, u, v and y, all $\geq 0$,
such that


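(2.7)

$$
\begin{bmatrix} B & C^* & -I & 0 \\ C^{*\prime} & 0 & 0 & I \end{bmatrix}
\begin{bmatrix} x \\ u \\ v \\ y \end{bmatrix}
=
\begin{bmatrix} a \\ d^* \end{bmatrix}
$$

and such that $v'x + u'y$ is minimized; this minimum value has to equal 0 [see
(2.2) and (2.4)]. Vectors x, u, v and y satisfying all these conditions satisfy all
the Kuhn-Tucker conditions, and vice versa.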

Let us also combine, for the sake of symmetry, (1.2) and (1.3), and let us 
agree to write 


(2.8)

$$
C' = \begin{bmatrix} C^{*\prime} \\ -I \end{bmatrix}; \qquad
d = \begin{bmatrix} d^* \\ 0 \end{bmatrix}.
$$

Here C' and d are (m + n) X n and (m + n) X 1 matrices respectively. We
can henceforth simply write

(2.9) $C'x \leq d$

instead of conditions (1.2) and (1.3). Moreover, we can write

(2.10)

$$
y'u + x'v = [d' - x'C] \begin{bmatrix} u \\ v \end{bmatrix}
$$

and this is precisely the expression to be minimized.
Let us pause a moment to interpret result (2.10). It says that with all (n + m)
constraints (2.9) we associate a real number such that either the constraint is
exactly satisfied, or the associated number equals 0. Hence, at most (n + m)
of the 2(n + m) elements of u, v, x and y are positive. Degeneracy, by definition,
will be said to occur when fewer than (n + m) elements will be positive. From
(2.7) we see that this special case happens when the right-hand vector $(a, d^*)$ is
linearly dependent on fewer than (n + m) columns of the matrix

$$
\begin{bmatrix} B & C^* & -I & 0 \\ C^{*\prime} & 0 & 0 & I \end{bmatrix}.
$$

Barring degeneracy for the time being, it can be concluded that exactly (m + n)
elements of u, v, x and y will be positive.
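The conditions in the form (2.5)-(2.7) are easy to check numerically once a candidate solution is at hand. The sketch below is an illustration only: it borrows the data of Houthakker's monopolist example from [6] (n = 4, m = 3), takes the candidate x as given in exact fractions, and derives u from the requirement that $v_i = 0$ wherever $x_i > 0$. All variable names are assumptions made here.

```python
from fractions import Fraction as F

n, m = 4, 3
a = [18, 16, 22, 20]
B = [[6, 1, 8, 0], [1, 10, 1, 4], [8, 1, 17, 3], [0, 4, 3, 11]]
Cstar_t = [[1, 1, 1, 1], [5, 0, 10, 0], [0, 4, 0, 5]]     # rows of C*'
dstar = [F(5, 3), 2, 3]

x = [F(2, 5), F(31, 133), F(0), F(55, 133)]               # candidate solution
Bx = [sum(F(b) * xi for b, xi in zip(row, x)) for row in B]

# The first side condition is amply satisfied, so u_1 = 0;
# u_2 and u_3 then follow from v_1 = 0 and v_2 = 0.
u = [F(0), (a[0] - Bx[0]) / 5, (a[1] - Bx[1]) / 4]

v = [Bx[i] + sum(u[k] * Cstar_t[k][i] for k in range(m)) - a[i] for i in range(n)]   # (2.5)
y = [dstar[k] - sum(c * xi for c, xi in zip(Cstar_t[k], x)) for k in range(m)]       # (2.6)

nonneg = all(z >= 0 for z in x + u + v + y)
comp = sum(vi * xi for vi, xi in zip(v, x)) + sum(uk * yk for uk, yk in zip(u, y))   # v'x + u'y
positives = sum(z > 0 for z in x + u + v + y)
print(nonneg, comp, positives)
```

In this instance exactly n + m of the 2(n + m) elements are positive, so the example is not degenerate in the sense defined above.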

3. The Theil-Van de Panne Method 

In this section the approach of Theil and Van de Panne will be briefly and 
heuristically explained. They reason that the inequality constraints (2.9) are 
the source of all trouble in the maximizing problem. Indeed, maximizing a func¬ 
tion under equality constraints is not much of a problem, and can be solved 
quite generally and rather simply with the device of Lagrangean multipliers. 
Hence the procedure they suggest for finding the solution vector $\hat{x}$ amounts to
the problem of finding the set S out of the (n + m) constraints (2.9), such that,
when (1.1) is maximized with the constraints belonging to S binding, a vector
$x^S$ results which is feasible and optimal; $x^S$ is defined as the vector maximizing
(1.1) with all constraints belonging to S binding.3

They generate this set S in the following way. First, maximize (1.1), never
mind the constraints (2.9). The resulting vector, say $x^0$,4 either will violate some
constraints, or else it will obviously be the solution vector. If $x^0$ does violate
some constraint(s), then they proceed to prove that $\hat{x}$, the solution vector, binds
at least one of the constraints violated by $x^0$. Or, in other words, if the supremum
is taken on at some constrained point, then at least one of the constraints binding
in this point will be violated by $x^0$. This is Rule 1. Hence, they next consider all
one-element sets S consisting of equations violated by $x^0$. If no resulting $x^S$ is
feasible, all two-element sets S are considered, of which the first element is one
that was violated by $x^0$, say constraint h, and the second element some constraint
violated by $x^h$, say constraint k. And so on. The successive steps of this procedure
are determined by the authors' Rule 2, which states that if $x^S = \hat{x}$, and if $S' \subset S$,
but $S' \neq S$, $x^{S'}$ will violate at least some constraint in $S - S'$; that is, if the
constraints S' which are imposed to be binding are indeed binding for the solution
vector and if there are some constraints which are not imposed to be binding,
but which are nevertheless binding for the solution vector, viz. $S - S'$, then
$x^{S'}$ will violate at least some constraint of the latter set.

3 Such a vector exists provided that the constraints belonging to S are not inconsistent
when written in equational form.

4 Note that $x^0$ also fits the definition of $x^S$ if one interprets the “0” as the empty set.
Suppose, next, that we have succeeded in finding a set S in this way such that
$x^S$ is feasible. By hypothesis, such a set S exists, and by our procedure and
Rules 1 and 2 we are bound to hit upon it at some stage. The question remains,
whether this feasible $x^S$ is the optimum vector such that

(3.1) $x^S = \hat{x}$.

Rule 3 of [6] then states that, under condition (3.1), for any $h \in S$, the vector
$x^{S-h}$ is such that upon substitution in (2.9) the constraint h is violated: $\sum_i c_{hi} x_i^{S-h} > d_h$.
Here, the vector $x^{S-h}$ is the vector maximizing (1.1) with all constraints
belonging to S, apart from the deleted constraint h, binding.

4. The Proof 

In this section we will present a mathematical proof of the Theil-Van de 
Panne rules, using the Kuhn-Tucker conditions. In the next section, following a 
discussion on degeneracy, a more intuitive argument will be used to prove the 
same. 

For any consistent set S, we can easily compute $x^S$ in a self-explanatory notation.
Write

(4.1) $Q(x, \lambda) = a'x - \tfrac{1}{2}x'Bx - \lambda_S'(C_S'x - d_S)$,

then

(4.2) $\dfrac{\partial Q(x, \lambda)}{\partial x} = a - Bx - C_S\lambda_S$

gives an expression for $x^S$ upon equating (4.2) to 0:

(4.3) $x^S = B^{-1}a - B^{-1}C_S\lambda_S$.

Since $x^0 = B^{-1}a$, we have

(4.4) $x^S = x^0 - B^{-1}C_S\lambda_S$.

To obtain an expression for the vector $\lambda_S$, the Lagrangeans associated with
the strict equalities, we premultiply (4.4) with $C_S'$:

(4.5) $C_S'B^{-1}C_S\lambda_S = C_S'x^0 - d_S$,

where use has been made of the fact that $C_S'x^S = d_S$. Hence

(4.6) $\lambda_S = (C_S'B^{-1}C_S)^{-1}(C_S'x^0 - d_S)$.

Now we have seen in Section 2 that the Kuhn-Tucker conditions say that
when $x^S = \hat{x}$, all components of $\lambda_S$ are positive (in the absence of degeneracy).
Define

(4.7) $P_S = C_S'B^{-1}C_S$,

then $P_S$ is a strictly positive definite matrix when the columns of $C_S$ are independent.5
This can safely be assumed. In fact, it follows from the assumption of

5 See, e.g. Zurmuhl [7, p. 133].
nondegeneracy. Rule 1 follows directly from (4.5), for

$\lambda_S'P_S\lambda_S = \lambda_S'(C_S'x^0 - d_S) > 0.$

Since $\lambda_S$ is a strictly positive vector, at least one element of $C_S'x^0 - d_S$ is positive,
and hence at least one constraint binding in the solution is violated by $x^0$. This,
at the same time, proves Rule 2. For the set S' which is imposed to be binding
in $x^{S'}$ and which is also binding in $x^S = \hat{x}$ can be used to eliminate an appropriate
set of variables from the objective function, after which the new objective function
is to be maximized subject to the old constraints except that those in S' are
deleted; and in this new problem $x^{S'}$ plays the role of $x^0$. In Section 7 we will use
a similar approach.

As for Rule 3, still assuming $x^S = \hat{x}$, we will show that deletion of one constraint
from S would lead to its violation in the resulting vector $x^{S-h}$. Assume constraint 1
($1 \in S$) is deleted and let us partition $P_S$ as follows:

(4.8)

$$
C_S'B^{-1}C_S = P_S = \begin{bmatrix} p_{11} & q' \\ q & P_{11} \end{bmatrix}
$$

where $p_{11}$ is the leading element of $P_S$. We need to prove that

(4.9) $C_1'x^{S-1} - d_1 > 0$,

or, with (4.4) and (4.8)

(4.10) $C_1'x^0 - d_1 - q'\lambda_{S-1} > 0$,

or, with the first element of (4.5) and (4.8)

(4.11) $[p_{11} \;\; q']\lambda_S - q'\lambda_{S-1} > 0$.

From (4.6) and (4.8) we immediately get

(4.12) $\lambda_{S-1} = P_{11}^{-1}(C_{S-1}'x^0 - d_{S-1})$

or, in view of (4.5) again

(4.13) $\lambda_{S-1} = P_{11}^{-1}[q \;\; P_{11}]\lambda_S = [P_{11}^{-1}q \;\; I]\lambda_S$.

Upon substitution of (4.13) into (4.11) we obtain for the left-hand side of
(4.11)

(4.14) $[p_{11} \;\; q']\lambda_S - q'[P_{11}^{-1}q \;\; I]\lambda_S$.

Partition $\lambda_S = \begin{bmatrix} \lambda_S^1 \\ \lambda_S^{-1} \end{bmatrix}$, then we can write (4.14):

(4.15) $p_{11}\lambda_S^1 + q'\lambda_S^{-1} - q'P_{11}^{-1}q\lambda_S^1 - q'\lambda_S^{-1} = (p_{11} - q'P_{11}^{-1}q)\lambda_S^1$.

This expression should be positive, given that $\lambda_S^1$ is positive. But, clearly,
$p_{11} - q'P_{11}^{-1}q > 0$. For we have, for any column $x \neq 0$ of appropriate order,
$x'P_Sx > 0$ because $P_S$ is positive definite; take $x' = [-1 \;\; q'P_{11}^{-1}]$, then

$0 < x'P_Sx = p_{11} - q'P_{11}^{-1}q$;

which means that Rule 3 holds whenever Kuhn-Tucker's $\lambda_S > 0$.
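As a small numerical check of the last inequality, consider a hypothetical two-constraint case (the numbers are chosen here purely for illustration and do not come from the paper):

$$
P_S = \begin{bmatrix} 2 & 1 \\ 1 & 3 \end{bmatrix}, \qquad p_{11} = 2, \quad q = 1, \quad P_{11} = 3 .
$$

Taking $x' = [-1 \;\; q'P_{11}^{-1}] = [-1 \;\; \tfrac{1}{3}]$ gives

$$
x'P_Sx = 2(-1)^2 + 2(-1)(\tfrac{1}{3})(1) + 3(\tfrac{1}{3})^2 = 2 - \tfrac{2}{3} + \tfrac{1}{3} = \tfrac{5}{3}
= p_{11} - q'P_{11}^{-1}q = 2 - \tfrac{1}{3} ,
$$

which is positive, so that the sign of (4.15) is that of $\lambda_S^1$.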
Conversely, it follows from $x^S = \hat{x}$ and $C_k'x^{S-k} > d_k$ (for all $k \in S$) that all
$\lambda_S^k$ are positive. For we can reverse the proof immediately. We know that the
left-hand side of (4.9) is positive, hence (4.15), which is nothing else than (4.9)
rewritten, is also positive; in these expressions 1 can be replaced by any $k \in S$.
Also, inequalities of the type $p_{11} - q'P_{11}^{-1}q > 0$ hold generally, hence $\lambda_S^k > 0$.

The result of all this is that Rule 3 of [6] is equivalent to the Kuhn-Tucker
condition, and hence may be reformulated: $x^S = \hat{x}$, if $x^S$ is feasible and $\lambda_S > 0$.
All these results were derived under the assumption of nondegeneracy. Let us
now consider degeneracy.


5. Degeneracy 

5.1. Degeneracy and Perturbations 

In the previous discussion we have explicitly excluded the possibility of
degeneracy. Fortunately, this is not a serious limitation, for we can easily allow
for degeneracy. In such a case we bring to bear a fundamental theorem, proved
in [2] and [3], asserting that if the original problem is feasible, then there will
always exist another problem, with slightly different parameters a* and d*, whose
solution is arbitrarily close to the solution of the original problem ($|\hat{x}^* - \hat{x}| < \epsilon$)
and which is such that not fewer than (n + m) elements of u, v, x and y are
positive.

All the same, it may be useful to indicate the very special character of degeneracy
in some more detail. Since under degeneracy fewer than (n + m)
elements of u, v, x and y are positive, there is at least one pair $(v_i, x_i)$ or $(u_k,
y_k)$ which is equal to (0, 0). We can give two alternative, essentially identical,
interpretations of such an occurrence. Either, some $x_i = 0$ or $y_k = 0$, though
$x_i$ was not imposed to be 0, or the kth constraint of (1.2) was not imposed to
be binding. Or alternatively, for some $x_i$ or $y_k$ imposed to be binding the associated
Lagrangean equals 0. In general, we should then write for any S such
that $x^S = \hat{x}$:

(5.1) $\lambda_S \geq 0$.

In accordance with the latter interpretation, let us assume that $1 \in S$, but
$\lambda_S^1 = 0$. Then we will first show that $x^S = x^{S-1} = \hat{x}$, which implies that the
solution vector remains unique, even though the set S generating this solution
need not be unique. From (4.4) we have

(5.2)

$$
x^S = x^0 - B^{-1}[C_1 \;\; C_{S-1}] \begin{bmatrix} \lambda_S^1 \\ \lambda_S^{-1} \end{bmatrix}
= x^0 - B^{-1}C_{S-1}\lambda_S^{-1},
$$

using the partitioned notation introduced before, and $\lambda_S^1 = 0$. Also from (4.4)

(5.3) $x^{S-1} = x^0 - B^{-1}C_{S-1}\lambda_{S-1}$.

Hence

(5.4) $x^S = x^{S-1}$
when it is true that

(5.5) $\lambda_S^{-1} = \lambda_{S-1}$,

given that $\lambda_S^1 = 0$. To prove (5.5), write, using (4.5)

(5.6)

$$
C_{S-1}'B^{-1}[C_1 \;\; C_{S-1}] \begin{bmatrix} \lambda_S^1 \\ \lambda_S^{-1} \end{bmatrix}
= C_{S-1}'x^0 - d_{S-1}.
$$

Hence

(5.7) $\lambda_S^{-1} = (C_{S-1}'B^{-1}C_{S-1})^{-1}(C_{S-1}'x^0 - d_{S-1})$,

and from (4.6) we immediately see that the right-hand side of this expression
equals $\lambda_{S-1}$.

It may be useful to illustrate all this by means of a simple picture, taken from
[6]; see Fig. 1. The two alternative interpretations in this concrete example
now take the following form. First, taking S = {1}, we obtain $x^1 = \hat{x}$, a vector
which violates no other constraint, though it happens to satisfy constraint 3
exactly. Second, taking S = {1, 3}, we obtain $x^{1,3} = \hat{x}$, which has as associated
vector

$\lambda_{13}^1 > 0$
$\lambda_{13}^3 = 0$.

Let us compare these two events with the “normal” situation. In the absence
of degeneracy, with the strict inequality holding in (5.1) for all $h \in S$ (and,
it should be added, $\sum_i c_{hi}x_i^S = d_h$ for no $h \notin S$) we have the simple result that


Figure 1 
a small increase in $d_h$ (for any $h \in S$) increases the value of $Q(x^S)$, while a
small decrease “hurts” the value of $Q(x^S)$. This follows from (5.1). Moreover,
no change in $d_h$, if small enough, influences the feasibility of $x^S$, as follows from
the fact that for all constraints not in S there was some “spare” room. Under conditions
of degeneracy these simple statements no longer hold. First, considering
$x^1$ again, we see that a slight change in $d_1$ may impair its feasibility.
This follows from the fact that $\sum_i c_{3i}x_i^1 = d_3$, and hence the smallest change in
$d_1$, leading to the minutest change in $x^1$, could make $\sum_i c_{3i}x_i^1 > d_3$. Second,
considering $x^{1,3}$, note that any change in $d_3$ (the constant term of the
binding constraint with associated Lagrangean equal to 0) “hurts” the
value of $Q(x^{1,3})$. This may be shown as follows. Since maximizing a function
under more constraints never increases the value of the maximand, we have
$Q(x^{1,3}) \leq Q(x^1)$. The equality holds for the very special case that, whether
explicitly stipulated or not, constraint 3 is binding anyway, which happens
when $\lambda_{13}^3$ equals 0. However, from (4.6) we have

$$
\frac{\partial \lambda_{13}^3}{\partial d_3} = -\left[(C_{13}'B^{-1}C_{13})^{-1}\right]_{33},
$$

where the right-hand side is a negative constant. Hence $\lambda_{13}^3$ equals 0 for but
one value of $d_3$, and even the smallest change “hurts” by resulting in a positive
or negative value of $\lambda_{13}^3$. It is worth-while to add that the figure also illustrates
that a slight change in $d_1$ or $d_3$ “dedegenerates” the situation again. Our
results are summarized in Table 1.

Notice in particular that the resulting changes in $\hat{x}$ can be made arbitrarily
small, and that after a perturbation the set S giving the solution vector is
unique in all 4 cases.


5£. Further Comment on the Proof 

It may be useful to give an intuitive argument for the proof in Section 4 using
the discussion on degeneracy. Let us exclude the case of degeneracy. Then
the set $S$ such that $x^S = \hat{x}$ is unique, and all $\lambda_h^S$ ($h \in S$) are strictly
positive. Consider then (4.6) again, and write, substituting (4.7):

(5.8) $\lambda^S = P_S^{-1}(C_S' x^0 - d_S)$

and hence

(5.9) $\partial \lambda^S / \partial d_S = -P_S^{-1}$.

TABLE 1

Results of Small Perturbations on Degeneracy

Perturbation      Vector $x^1$     Vector $x^{1,3}$       Value $Q(\hat{x})$ after perturbation
Increase $d_1$    solution         lower $Q(x)$ value     greater
Decrease $d_1$    not feasible     solution               smaller
Increase $d_3$    solution         lower $Q(x)$ value     no change
Decrease $d_3$    not feasible     solution               smaller
Since $P_S$ is positive definite, we have, for all $h \in S$,

(5.10) $\partial \lambda_h^S / \partial d_h = -p^{hh}$ = a negative constant,

where in the usual manner the superscripts denote elements from the inverse of
$P_S$. Equivalently

(5.11) $\partial Q(x, \lambda) / \partial d_h = \lambda_h^S > 0$ and $\partial^2 Q(x, \lambda) / \partial (d_h)^2 = \partial \lambda_h^S / \partial d_h < 0$.

Hence, increasing $d_h$ ($h \in S$) increases the value of $Q(x^S)$ at a decreasing
rate. Furthermore, according to (5.10), there will always be an increase in
$d_h$ such that $\lambda_h^S$ becomes 0. Call this increase $d_{h0} > 0$. Now we have
indicated in our discussion on degeneracy that after adding $d_{h0}$ to $d_h$, so that
$\lambda_h^S = 0$, it does not make any difference whether we maximize $Q(x)$ under
constraints belonging to set $S$ including $h$ or excluding $h$, see (5.4).
Disregarding constraint $h$ from the original set $S$, giving the vector $x^{S-h}$,
does increase $Q(x)$ in exactly the same way as an increase of $d_h$ with $d_{h0}$
would, while keeping constraint $h$ as an equality ($\sum_i c_{hi} x_i = d_h + d_{h0}$).
They produce the same vector $x^{S-h}$. But now:

(5.12) $C_h' x^{S-h} = d_h + d_{h0} > d_h$,

and hence we see immediately that for each $h \in S$ maximization of $Q(x)$ while
disregarding constraint $h$ leads to violation of the $h$th constraint.

Finally, it is of some interest to observe from (4.8) and (5.10), that

(5.13) $\partial \lambda_h^S / \partial d_i = \partial \lambda_i^S / \partial d_h$.

Interpreting the $\lambda_h^S$ as the shadow price of the “source” $d_h$ [this well-known
interpretation is essentially based on (5.1)], this equality says that the
influence of an increase in the supply of the $i$th source on the price of source
$d_h$ is equal to the influence of an increase in source $d_h$ on the price of source
$d_i$. Unfortunately, there are no general rules on the sign of this expression.


6. Numerical Illustration of Degeneracy 

As a numerical example of the Theil-Van de Panne approach, consider the 
following example borrowed from Houthakker [4]. He considered a problem 6 
which arose in the context of a monopolist facing linear demand functions for 
each of his four products. When we consider the monopolist as a quantity- 
adaptor it is useful to invert the demand functions, to get 

(6.1) p = o + §i?x, 

giving the price vector as a linear function of the quantities. Assume the
monopolist strives to maximize total revenue $p'x$, but that his freedom of action
is limited by the fact that three factors of production are scarce, apart from
the obvious limitation that no negative quantities can be produced. Under these
conditions, we can specify the matrices $a$ and $B$ of (1.1) and (6.1) as follows:


6 The problem is slightly changed to incorporate degeneracy. 
and the matrices $C'$ and $d$ of (2.9):


0 -1 
0 0 


'1392/1330' 

2 

3 

0 

0 

0 

0 


These constraints will be numbered consecutively from 1 to 7. The matrix B 
is positive definite. 

To prevent rounding errors all computations have been made without division.
First, let us check whether $x^0$ is feasible:

$$x^0 = B^{-1}a = \frac{1}{2916}\,(13\,296,\ 1\,384,\ -3\,584,\ 5\,776)'.$$

Substitution in (6.2) shows that the constraints 1, 2, 3 and 6 are violated by $x^0$:

(6.5) $$C'x^0 - d = \frac{1}{(2916)(1330)}\,(18\,380\,688,\ 32\,994\,640,\ 34\,138\,440,\ -17\,683\,680,\ -1\,840\,720,\ 4\,766\,720,\ -7\,682\,080)'.$$

Hence we try next $x^1$, $x^2$, $x^3$ and $x^6$. Should one of these vectors be
feasible, we would have found the solution vector, for Rule 3 is complied with.
However, computations show that not even one of these vectors is feasible. All
necessary computations can be made with the aid of (6.5) and the matrix
$C'B^{-1}C$:

(6.6) $$C'B^{-1}C = \frac{1}{2916}
\begin{pmatrix}
981 & 1365 & 1971 & -855 & -114 & 291 & -303 \\
1365 & 18025 & -2385 & -75 & -10 & -1765 & 485 \\
1971 & -2385 & 8829 & -621 & -666 & 549 & -1233 \\
-855 & -75 & -621 & 1521 & -186 & -753 & 273 \\
-114 & -10 & -666 & -186 & 364 & 94 & -158 \\
291 & -1765 & 549 & -753 & 94 & 553 & -185 \\
-303 & 485 & -1233 & 273 & -158 & -185 & 373
\end{pmatrix}.$$
To check for feasibility first compute $\lambda_S$ from (4.6). For example:

$$\lambda_1 = \frac{2916}{981} \cdot \frac{1}{(2916)(1330)} \cdot (18\,380\,688),$$

and similarly for all positive elements of (6.5). Since (6.6) is positive definite,
all one-element Lagrangeans are positive. Then, for $x^S$, say, to be feasible, we
should have for all constraints not in $S$, say $\bar{S}$:

$C_{\bar{S}}' x^S \le d_{\bar{S}}$,

or, with (4.4)

(6.7) $C_{\bar{S}}' x^S = C_{\bar{S}}' x^0 - C_{\bar{S}}' B^{-1} C_S \lambda_S \le d_{\bar{S}}$,

or again

(6.8) $C_{\bar{S}}' B^{-1} C_S \lambda_S - (C_{\bar{S}}' x^0 - d_{\bar{S}}) \ge 0$.

The matrix $C_{\bar{S}}' B^{-1} C_S$ can be read immediately from (6.6) and
$C_{\bar{S}}' x^0 - d_{\bar{S}}$ from (6.5). And $\lambda_S$ is computed from (4.6), as said above:

(6.9) $\lambda_S = (C_S' B^{-1} C_S)^{-1} (C_S' x^0 - d_S)$.

If any component of $\lambda_S$ so computed is negative, the vector $x^S$, even though it
may be feasible, is not the solution vector. It turns out that no vector is
feasible. For example, $x^3$ violates 1, 2 and 6. Hence, consider all 2-element
sets $S$ with a first element equal to a constraint violated by $x^0$ (i.e., 1, 2, 3
or 6) and a second violated by the corresponding vector, in this case
respectively $x^1$, $x^2$, $x^3$ or $x^6$. For example, consider $S = \{3, 6\}$. It turns
out that $x^{3,6}$ violates 1, 2 and 5. Similarly, it turns out that no 2-element set
$S$ produces a feasible vector $x^S$. Continuing with 3-element sets, we have to
take, among many others, $S = \{2, 3, 6\}$. With the benefit of hindsight, we will
consider this case in some detail.
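Formula (6.9), taking $S = \{2, 3, 6\}$, gives:

$$\lambda_{236} = \frac{2916}{56\,545\,322\,400}
\begin{pmatrix}
4\,581\,036 & 349\,920 & 14\,273\,820 \\
349\,920 & 6\,852\,600 & -5\,686\,200 \\
14\,273\,820 & -5\,686\,200 & 153\,454\,500
\end{pmatrix}
\cdot \frac{1}{2916}
\begin{pmatrix} 24\,808 \\ 25\,668 \\ 3\,584 \end{pmatrix}$$

$$= \frac{1}{56\,545\,322\,400}
\begin{pmatrix} 173\,785\,458\,528 \\ 164\,194\,011\,360 \\ 758\,132\,472\,960 \end{pmatrix}
= \frac{1}{6650}
\begin{pmatrix} 20\,438 \\ 19\,310 \\ 89\,160 \end{pmatrix}.$$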
Checking feasibility with (6.8), using the relevant elements of (6.6) and (6.5): 
(6650) (2916)


20 438 
19 310 
89 160 


91 903 440 
-88 418 400 
-9 203 600 
-38 410 400 


(6650) (2916) 


10 822 260 
4 928 560 
801 900 


Hence $x^{2,3,6}$ provides the optimal solution: all Lagrangeans positive and (6.8)
non-negative.
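The arithmetic of this step can be re-checked with exact rational numbers. The sketch below is only an illustration: it takes the entries of (6.5) and (6.6) as printed above, and the helper names `solve` and `multipliers_and_slack` are hypothetical, not part of the original paper. It evaluates (6.9) for $S = \{2, 3, 6\}$ and the left-hand side of (6.8) for the remaining constraints.

```python
from fractions import Fraction as F

# (6.6) scaled by 2916 and (6.5) scaled by 2916 * 1330, as printed in the text.
M = [
    [981, 1365, 1971, -855, -114, 291, -303],
    [1365, 18025, -2385, -75, -10, -1765, 485],
    [1971, -2385, 8829, -621, -666, 549, -1233],
    [-855, -75, -621, 1521, -186, -753, 273],
    [-114, -10, -666, -186, 364, 94, -158],
    [291, -1765, 549, -753, 94, 553, -185],
    [-303, 485, -1233, 273, -158, -185, 373],
]
V = [18380688, 32994640, 34138440, -17683680, -1840720, 4766720, -7682080]

P = [[F(e, 2916) for e in row] for row in M]   # C'B^{-1}C
v = [F(e, 2916 * 1330) for e in V]             # C'x0 - d


def solve(A, b):
    """Solve A x = b exactly by Gauss-Jordan elimination (A square, nonsingular)."""
    n = len(b)
    aug = [list(A[i]) + [b[i]] for i in range(n)]
    for c in range(n):
        p = next(r for r in range(c, n) if aug[r][c] != 0)
        aug[c], aug[p] = aug[p], aug[c]
        piv = aug[c][c]
        aug[c] = [e / piv for e in aug[c]]
        for r in range(n):
            if r != c and aug[r][c] != 0:
                fac = aug[r][c]
                aug[r] = [e - fac * pe for e, pe in zip(aug[r], aug[c])]
    return [aug[i][n] for i in range(n)]


def multipliers_and_slack(S):
    """Return lambda_S from (6.9) and the left-hand side of (6.8).

    Constraints are numbered from 1, as in the text."""
    idx = [h - 1 for h in S]
    rest = [h for h in range(7) if h not in idx]
    lam = solve([[P[i][j] for j in idx] for i in idx], [v[i] for i in idx])
    slack = {h + 1: sum(P[h][j] * l for j, l in zip(idx, lam)) - v[h] for h in rest}
    return lam, slack


violated_by_x0 = [h + 1 for h in range(7) if v[h] > 0]
lam, slack = multipliers_and_slack([2, 3, 6])
print(violated_by_x0)
print(lam)
print(slack)
```

Under these assumptions the multipliers agree with $\lambda_{236}$ above, the slack of constraint 1 is exactly zero (the degeneracy discussed next), and the slacks of constraints 4, 5 and 7 are non-negative.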


$$x^{2,3,6} = x^0 - B^{-1} C_{236} \lambda_{236} = (4/10,\ 31/133,\ 0,\ 55/133)'.$$


That the solution happens to satisfy constraint 1 exactly, even though this
was not stipulated, indicates degeneracy. In fact, considering $S = \{1, 2, 3, 6\}$,
we get for $\lambda_{1236}$ the exact vector:

$$\lambda_{1236} = \frac{1}{(6650)(123\,974\,556\,480)}
\begin{pmatrix} 0 \\ 2\,533\,791\,985\,338\,240 \\ 2\,393\,948\,685\,628\,800 \\ 11\,053\,571\,455\,756\,800 \end{pmatrix}
= \frac{1}{6650}
\begin{pmatrix} 0 \\ 20\,438 \\ 19\,310 \\ 89\,160 \end{pmatrix},$$

TABLE 2 

Solution Values of the Kuhn-Tucker Variables 





where we needed to invert the $4 \times 4$ matrix $C_{1236}' B^{-1} C_{1236}$. Obviously (5.5)
holds, and similarly (5.4). In practice, degeneracy is likely to escape notice
through rounding errors. Summing up the complete solution in Kuhn-Tucker
notation we can collect our results as in Table 2.


7. How to Handle Equality Constraints 

It may happen that there are a number of equalities (rather than inequalities)
included among constraints (1.2). Let us assume that we have $k$ equalities:

(7.1) $Ex = f$ ($\sum_i e_{hi} x_i = f_h$; $h = 1, \ldots, k$).

The rank of $E$ can be assumed to be equal to $k$; if it were not, either $(k - 1)$
equations would suffice to give the same information, or else the system would
be inconsistent. Assuming the leading submatrix of order $k \times k$ to be
nonsingular, the most straightforward approach appears to be to eliminate the
first $k$ variables $x_i$ ($i = 1, \ldots, k$), by expressing them in the remaining
variables. Partitioning:

(7.2) $$(E_1 \ \ E_2) \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = f,$$

where $E_1$, $E_2$, $x_1$ and $x_2$ are of order $k \times k$, $k \times (n - k)$, $k \times 1$ and
$(n - k) \times 1$ respectively. Solving for $x_1$:

(7.3) $x_1 = E_1^{-1} f - E_1^{-1} E_2 x_2$.


Substituting these expressions in (1.1) and (2.9) has the disadvantage of
losing the specially simple form matrices $B$ and $C$ may have in the original
problem; as an advantage there are fewer variables. Another advantage is that
this procedure always “works,” since the new matrix $B^*$, say, of order
$(n - k) \times (n - k)$ will always remain positive definite: substitute (7.3) in
(1.1):

(7.4) $$a' \begin{pmatrix} E_1^{-1} f - E_1^{-1} E_2 x_2 \\ x_2 \end{pmatrix}
- \tfrac{1}{2} \left[ f'(E_1^{-1})' - x_2' E_2' (E_1^{-1})' \quad x_2' \right]
\begin{pmatrix} B_{kk} & B_{k,n-k} \\ B_{n-k,k} & B_{n-k,n-k} \end{pmatrix}
\begin{pmatrix} E_1^{-1} f - E_1^{-1} E_2 x_2 \\ x_2 \end{pmatrix}.$$

The latter part of this expression consists of a constant, an expression linear in
$x_2$, and the quadratic form

$$x_2' \left[ -E_2'(E_1^{-1})' \quad I \right]
\begin{pmatrix} B_{kk} & B_{k,n-k} \\ B_{n-k,k} & B_{n-k,n-k} \end{pmatrix}
\begin{pmatrix} -E_1^{-1} E_2 \\ I \end{pmatrix} x_2 .$$

The $(n - k) \times (n - k)$ matrix of this quadratic form will again be positive
definite, since the rows of the premultiplying matrix and columns of the
postmultiplying matrix are linearly independent, see footnote 6 above.
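The elimination can be sketched in a few lines of exact arithmetic. The code below is a minimal illustration, assuming the objective has the form $a'x - \tfrac{1}{2}x'Bx$ (maximized) and writing $x = x_p + N x_2$ with $x_p = (E_1^{-1}f,\ 0)'$ and $N = (-E_1^{-1}E_2;\ I)$. The function names and the small three-variable data set are hypothetical and serve only as an example.

```python
from fractions import Fraction as F


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def solve(A, B):
    """Solve A X = B exactly by Gauss-Jordan elimination (A square, nonsingular)."""
    n = len(A)
    aug = [list(A[i]) + list(B[i]) for i in range(n)]
    for c in range(n):
        p = next(r for r in range(c, n) if aug[r][c] != 0)
        aug[c], aug[p] = aug[p], aug[c]
        piv = aug[c][c]
        aug[c] = [e / piv for e in aug[c]]
        for r in range(n):
            if r != c and aug[r][c] != 0:
                fac = aug[r][c]
                aug[r] = [e - fac * pe for e, pe in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]


def eliminate_equalities(a, B, E, f):
    """Reduce max a'x - 1/2 x'Bx subject to Ex = f to a problem in x2 only.

    Returns (a_star, B_star, N, xp) with x = xp + N x2, following (7.2)-(7.3)."""
    k, n = len(E), len(E[0])
    E1 = [row[:k] for row in E]
    E2 = [row[k:] for row in E]
    E1inv_f = solve(E1, [[fi] for fi in f])      # E1^{-1} f
    E1inv_E2 = solve(E1, E2)                     # E1^{-1} E2
    xp = E1inv_f + [[F(0)] for _ in range(n - k)]
    N = [[-e for e in row] for row in E1inv_E2]
    N += [[F(int(i == j)) for j in range(n - k)] for i in range(n - k)]
    Nt = transpose(N)
    B_star = matmul(matmul(Nt, B), N)            # matrix of the reduced quadratic form
    Bxp = matmul(B, xp)
    a_star = matmul(Nt, [[ai - bi[0]] for ai, bi in zip(a, Bxp)])
    return a_star, B_star, N, xp


# Hypothetical data: three variables, one equality x1 + x2 + x3 = 1.
a = [F(1), F(2), F(3)]
B = [[F(2), F(0), F(0)], [F(0), F(2), F(0)], [F(0), F(0), F(2)]]
E = [[F(1), F(1), F(1)]]
f = [F(1)]

a_star, B_star, N, xp = eliminate_equalities(a, B, E, f)
x2 = solve(B_star, a_star)                       # unconstrained maximum in x2
x = [p[0] + d[0] for p, d in zip(xp, matmul(N, x2))]
print(B_star, x)
```

In this example the reduced matrix $B^*$ is again positive definite, and the recovered $x$ satisfies the equality exactly; inequality constraints, if present, would have to be rewritten in terms of $x_2$ in the same way.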

Alternatively, we could proceed in line with the Theil-Van de Panne approach;
but then we should not originally start at the point of the unconstrained
maximum, $x^0$, but at $x^E$, say, where $E$ is the set of $k$ equalities. Formulae
(4.1) to (4.6) remain valid. The situation is, in fact, very much similar to a
situation we have in the Theil-Van de Panne approach after some steps have
been made. Very much, but not quite. The difference is that when equalities are
present the Lagrangeans associated with them need not necessarily all be
positive. The situation is completely analogous to the unsymmetric dual problem
in linear programming; as will be recalled, the dual of a primal subject to
equality rather than inequality constraints amounts to finding a vector the
elements of which are not restricted to be non-negative. Referring to Figure 1,
suppose constraint 3 is imposed as an equality. The associated $\lambda_3^{3}$ is negative
[for a decrease in $d_3$ would increase $Q(x)$]. $\lambda_3^{1,3}$ equals 0. $\lambda_3^{1,3}$ evaluated
at $d_3 + \epsilon$ is negative and $\lambda_3^{1,3}$ evaluated at $d_3 - \epsilon$ is positive. Clearly,
the associated Lagrangeans may change in sign as the set $E$ is increased by more
constraints, taken from the inequalities. However, the “$E$-Lagrangeans” may
have, at any stage, any sign, while we specifically choose the constraints
belonging to $S$ such that $\lambda_i^S > 0$ for all $i$, at $x = \hat{x}$, as we pointed out in
Section 4.
A METHOD OF SOLUTION FOR
QUADRATIC PROGRAMS* 

C. E. LEMKE 

Rensselaer Polytechnic Institute , Troy , New York 

This paper describes a method of minimizing a strictly convex quadratic
functional of several variables constrained by a system of linear inequalities.
The method takes advantage of strict convexity by first computing the absolute
minimum of the functional. In the event that the values of the variables
yielding the absolute minimum do not satisfy the constraints, an equivalent
and simplified quadratic problem in the ‘Lagrange multipliers’ is derived.
An efficient algorithm is devised for the transformed problem, which leads
to the solution in a finite number of applications. A numerical example
illustrates the method.


1. Introduction 

We are concerned with the general quadratic programming problem posed
in the following form: Minimize

(1) $-a^T x + \tfrac{1}{2} x^T Q_0 x$,

over the set of all $x$ satisfying:

(2) $A^T x \le d_0$.

Here matrix $A$ has order $m \times n$; $Q_0$ is $m \times m$; $a$ and $x$ are $m \times 1$; and $d_0$
is $n \times 1$. Superscript $T$ denotes matrix transposition. It is assumed that $Q_0$ is
positive definite and symmetric.

Ignoring constraints (2) momentarily, the minimum of (1) is taken on at
the unique point

(3) $x_0 = Q_0^{-1} a$,

where the gradient $Q_0 x - a$ vanishes. If, further, $x_0$ satisfies the constraints
(2), it must then solve the quadratic problem. If not, the optimum is taken on
for some point $x$ on the boundary of the convex polyhedron described by the
constraints (2).

In a recent article [7], Theil and van de Panne appear to have been the first
to utilize the non-singularity of $Q_0$, and, starting from knowledge of $x_0$, to
systematically search out the optimal boundary point. As the authors noted,
such a technique does not require a feasible point to initiate the calculations.
Their technique is compared briefly at the end of this paper with the one
described below. Actually our method appears to resemble more that proposed by
Beale in 1955 [1].

In Section 2, we derive an equivalent problem, which is immediately feasible
and permits a simple algorithm. Following Section 3, where the algorithm is
described and necessary proofs given, an example, also used by Theil and van
de Panne, is furnished.


2. Initial Transformation 

The Kuhn-Tucker geometric conditions for an optimum state that a feasible
point $x$ is a solution if and only if the gradient $(Q_0 x - a)$ at $x$ is a
non-positive combination of the outward-directed normals of those support
hyperplanes, if any, containing $x$; that is, if and only if there is some $y$
such that:

(4) $(Q_0 x - a) + Ay = 0$; $y \ge 0$; and $y^T(d_0 - A^T x) = 0$;

the last condition ensuring that only those support planes containing $x$ are
considered. $y = 0$ would correspond to the point $x_0$, which would be the unique
solution if it satisfied the constraints.

In the first part of (4) we may solve for $x$ in terms of $y$ uniquely:

(5) $x = Q_0^{-1}(a - Ay)$,

from which the final solution may be computed, using an optimal set $y$ of
‘Lagrange multipliers’.

Using (5) we may eliminate $x$ from (2). Then defining:

(6) $Q = A^T Q_0^{-1} A$; $d = d_0 - A^T x_0$; and $z = d + Qy$,

we may pose the following problem which, because of (5), is fully equivalent
to the original:

Find a pair of vectors $y$ and $z$ which satisfy:

(7) $-Qy + z = d$; $y, z \ge 0$; and $y^T z = 0$,

where $Q$ is symmetric and non-negative definite.

Let us now note that, by the Kuhn-Tucker conditions, this problem is entirely
equivalent to the following quadratic problem in the variable point $y$:
Minimize

(8) $f(y) = d^T y + \tfrac{1}{2} y^T Q y$, subject to $y \ge 0$,

where $Q$ and $d$ represent the given data, and $Q$ is a symmetric and non-negative
definite matrix.

It is this ‘derived’ problem, which has some interest in its own right, which
shall be solved. Having an optimal $y$, the optimal $x$ for the original problem is
obtained as in (5).
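For readers who want the intermediate step written out, the substitution that leads from (2) and (5) to the conditions on $y$ and $z$ can be displayed as follows. It only restates the definitions in (3), (5) and (6); no additional assumption is made.

```latex
\begin{align*}
A^T x &= A^T Q_0^{-1}(a - Ay) = A^T x_0 - Qy, \\
d_0 - A^T x &= (d_0 - A^T x_0) + Qy = d + Qy = z .
\end{align*}
```

Hence the constraints (2) are the statement $z \ge 0$, and the complementarity condition in (4) becomes $y^T z = 0$.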

Note that a ‘feasible’ point, i.e., one satisfying the constraints $y \ge 0$, namely
$y = 0$, yielding $f(0) = 0$, is immediately available, and is indeed the only
extreme point of the constraint set (the non-negative orthant), which is just a
simple cone. $y = 0$ would yield the solution if and only if $d \ge 0$, as (7) shows.
Finally, with reference to the method to be described, note that no assumptions
as to strict convexity of $f(y)$, or any assumptions as to ‘degeneracy’ usually
required for linear programming methods are made. We require merely that a
finite solution exists.
3. The Algorithm

Preliminaries 

With reference to the conditions (7) for an optimum, $y$ and $z$ form an optimal
pair if and only if (i) $y$ is a point in the non-negative orthant; (ii) $z$, the
gradient of $f$ at $y$, points into the non-negative orthant; and (iii) $y$ is
perpendicular to its corresponding gradient $z$.

Throughout the calculations condition (i), starting with $y = 0$, is retained,
and as the iterations proceed the functional $f$ will be non-increasing. After a
finite number of iterations, the calculations will end with conditions (ii) and
(iii) satisfied.

Associated with an iteration is a point $y$ and a set of $n$ independent directions.
As in the extreme point methods of linear programming, these directions appear
as the rows of the inverse of an $n \times n$ ‘current basis matrix’. In going from one
iteration to the next, the current basis matrix is altered by replacing a single one
of its columns by some other column, and the required ‘current inverse’ is
obtained by algorithm from the inverse associated with the previous iteration. The
original data $Q$ and $d$ are retained throughout, and all calculation is based on
the computed inverse.

We adopt the following notation. If $B$ is the current basis matrix we write:

(9) $B = (b_1, b_2, \ldots, b_n)$,

so that $b_i$ denotes the $i$th column of $B$. Further we write:

(10) $(B^{-1})^T = (b^1, b^2, \ldots, b^n)$,

so that $b^i$ denotes the $i$th row of $B^{-1}$, written in column form. The statement
that $BB^{-1} = I$, the identity matrix of order $n$, is then equivalent to the scalar
conditions:

(11) $b_i^T b^i = 1$; $b_i^T b^j = 0$ for $i \ne j$.

We denote the $i$th column of $I$ by $e_i$, which is thus a column with 1 as its $i$th
component and 0 as its other components. The constraints $y \ge 0$ may be
expressed in scalar form as $e_i^T y \ge 0$; $i = 1, 2, \ldots, n$.

Now as to the composition of the current basis matrix $B$, during an iteration
some components of the current feasible point $y$ will be 0; that is, for some values
of $i$, $y$ will lie on the ‘bounding hyperplane’ $e_i^T y = 0$, with positive normal $e_i$.
For some of these $i$ (initially, all of them) the corresponding vector $e_i$ will appear
as a column of $B$. If for any $i$, we have $e_i$ as a column of $B$ and also $e_i^T y = 0$ we
shall (following Beale [1]; see also [6]) call such a column of $B$ a ‘restricted’
column. On the first iteration, all columns of $B$ are restricted columns and in
fact $B = I$. The other columns of $B$ are ‘free’ columns, and are generated by the
algorithm as described below.

Thus, for each iteration there will be a subset $R$ of the set of integers from
1 to $n$ which will specify which columns of $B$ are restricted columns.

The current basis $B$ will change by one column in going from one iteration to
the next. We will denote by $b_r$ the column to be replaced, and by $\underline{b}_r$ the
replacing column.

Finally, the criterion for continuing the iterations is based on the values of
the current $w$ expressed by:

(12) $w = B^{-1} z = B^{-1}(d + Qy)$.

One Iteration 

At the start of the iteration one has the following computed data:

a. $B^{-1}$ (whose $i$th row is $b^{iT}$)

b. $y \ge 0$ (the current feasible point)

c. $z$ (the gradient of $f$ at $y$)

d. $R$ (specifying which columns of $B$ are restricted).

The above set of data is obtained by algorithm from the previous iteration’s
data. In the following, underscored items will refer to the new iteration.

To initiate the calculations, $B = B^{-1} = I$, so that $b^i = b_i = e_i$; $y = 0$;
$w = z = d$; and $R$ is the whole set. An iteration consists of selecting a column
of $B$ to be replaced; selecting a column subsequently to replace it; and
modifying the data.

Selecting $b_r$. Compute: $w = B^{-1} z$.

If the pair of current solutions $y$ and $z$ are not optimal there are two cases:
Case I. Some component of $w$ corresponding to a free column of $B$ is not zero.
Case II. All components of $w$ corresponding to free columns of $B$ are zero, but
$w \ge 0$ does not hold.

If neither of these cases holds we are finished, as will be shown. If either case
holds there will be some value of $i$, which we label $r$, which singles out the $r$th
component $\omega_r$ of $w$. In Case I, $\omega_r$ is not 0, while $b_r$ is a free column. In Case II,
$\omega_r$ is negative, while $b_r$ is a restricted column. In either case, $b_r$ is selected as
the vector to be replaced. In the event that more than one value of $i$ qualifies for
$i = r$, we may select any value. However, for the sake of definiteness we shall
select that value of $r$ for which $\omega_r$ has the largest absolute value.
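The selection rule can be sketched as a small function. This is an illustration under stated assumptions, not the author's code: indices are 0-based, `restricted` is the set $R$ shifted down by one, and the name `select_r` and the tolerance `tol` are hypothetical.

```python
def select_r(w, restricted, tol=1e-12):
    """Return (case, r) for the column to be replaced, or (None, None) if optimal.

    w is the current vector B^{-1} z; restricted is the set of 0-based
    indices of restricted columns; all other columns are free."""
    free = [i for i in range(len(w)) if i not in restricted]
    # Case I: a free column with a non-zero component of w.
    candidates = [i for i in free if abs(w[i]) > tol]
    if candidates:
        return "I", max(candidates, key=lambda i: abs(w[i]))
    # Case II: all free components vanish, but some restricted one is negative.
    candidates = [i for i in restricted if w[i] < -tol]
    if candidates:
        return "II", max(candidates, key=lambda i: abs(w[i]))
    return None, None


# With w = d from the example of Section 4 and every column restricted,
# the rule picks the most negative component.
d = [4.560, 0.475, -1.229, 1.981, -4.119, -8.508, -8.802]
first_choice = select_r(d, set(range(7)))
print(first_choice)
```

Ties are broken by whichever index `max` meets first, which is consistent with the remark that any qualifying value may be chosen.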

Selecting $\underline{b}_r$. We seek a new feasible point of the form:

(13) $\underline{y} = y - \theta b^r$,

where $\theta$ is so selected as to minimize $f$ on that part of the line $y - \theta b^r$ which
remains in the constraint set $y \ge 0$. The selection of $\theta$ will determine $\underline{b}_r$. Thus
we determine $\theta$ as follows: Compute:

(14) $q = Q b^r$.

Now $q = 0$ or not. First suppose $q \ne 0$, and compute:

(15) $\theta_0 = \omega_r / q^T b^r$.

The functional is minimized for this value of $\theta$. In fact one easily verifies that:

(16) $f(y - \theta b^r) = f(y) - \tfrac{1}{2} q^T b^r \left[ \theta_0^2 - (\theta - \theta_0)^2 \right]$
($= f(y) - \theta \omega_r$ when $q = 0$).

To retain feasibility, we then compute:

(17) $t_0 = \operatorname{Min}_i \dfrac{e_i^T y}{\theta_0 \, e_i^T b^r}$,

where the minimum is taken over those $i$ for which the denominator is positive.
If there is no such $i$ we take $t_0$ as infinite.

We take $\theta$ equal to $\theta_0$ if $t_0 \ge 1$, and $\theta$ equal to $t_0 \theta_0$ if $t_0 < 1$. In the latter
case it is possible that $\theta$ equals 0.

When $\theta = t_0 \theta_0$ there is some value of $i$, which we label $k$, such that:

(18) $\theta = \dfrac{e_k^T y}{e_k^T b^r}$.

If more than one value of $k$ is possible, we select any one arbitrarily.

If $q = 0$, then:

$f(y - \theta b^r) = f(y) - \theta (d^T b^r) = f(y) - \theta \omega_r$,

so that any non-zero $\theta$ having the sign of $\omega_r$ will decrease $f$. Then, from our
assumption of finite optimum, for some $\theta$ having the sign of $\omega_r$, $y - \theta b^r$ will
strike the boundary of the set $y \ge 0$. Thus, when $q = 0$ we take $\theta_0$ in (17) as
signum $\omega_r$, and $\theta = t_0 \theta_0$.
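The choice of $\theta$ in (15) to (18) can be sketched as follows. This is an illustrative reading of the rules above, with hypothetical names (`step_length`, `tol`) and 0-based indices; the test data are the values printed for Iteration 3 of the example in Section 4.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def step_length(y, b_r, omega_r, q, tol=1e-12):
    """Return (theta, k) for the move y - theta * b_r.

    k is None when theta = theta_0 (Cases Ia, IIa); otherwise k is the
    0-based index of the component of y that reaches zero (Cases Ib, IIb)."""
    q_nonzero = any(abs(c) > tol for c in q)
    if q_nonzero:
        theta0 = omega_r / dot(q, b_r)            # (15)
    else:
        theta0 = math.copysign(1.0, omega_r)      # q = 0: signum of omega_r
    t0, k = math.inf, None
    for i, (yi, bi) in enumerate(zip(y, b_r)):
        denom = theta0 * bi
        if denom > tol and yi / denom < t0:       # (17): positive denominators only
            t0, k = yi / denom, i
    if q_nonzero and t0 >= 1:
        return theta0, None
    return t0 * theta0, k


# Data printed for Iteration 3 of the example: r = 3, b^r is row 3 of B^{-1}.
y = [0, 0, 0, 0, 0, 0.91316, 1.70012]
b_r = [0, 0, 1, 0, 0, 0.09305, -0.03712]
q = [-0.50494, 0.08031, 0.25223, -0.06461, 0.23690, 0, 0]
theta, k = step_length(y, b_r, -1.69389, q)
y_new = [yi - theta * bi for yi, bi in zip(y, b_r)]
print(theta, k, y_new)
```

When $q = 0$ and no denominator is positive the function returns an infinite step, which corresponds to the excluded situation in which no finite optimum exists.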

The value of $\theta$ then defines four cases and the replacing vector as follows:

Case Ia or Case IIa: $\theta = \theta_0$. Then $\underline{b}_r = Q b^r = q$.
Case Ib or Case IIb: $\theta = e_k^T y / e_k^T b^r$. Then $\underline{b}_r = e_k$.

Modifying the data:

a. $\underline{B}^{-1}$, whose rows are given by the formulas:

(19) $\underline{b}^r = (1 / \underline{b}_r^T b^r)\, b^r$; $\underline{b}^i = b^i - (\underline{b}_r^T b^i)\, \underline{b}^r$ for $i \ne r$.

b. $\underline{y} = y - \theta b^r$.

c. $\underline{z} = z - \theta q$.

d. $\underline{R}$ is obtained from $R$ by (i) no change for Cases Ia or IIb; (ii) including
$r$ for Case Ib; and (iii) deleting $r$ for Case IIa.

This completes the description of an iteration. 
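The row update (19) is the familiar product-form change of an inverse when one column of the basis is replaced. A minimal sketch, with the hypothetical name `update_inverse` and 0-based indices, is given below; it is applied to the first iteration of the example in Section 4, where $B = I$, $r = 7$ and the replacing column is $q = Qe_7$.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def update_inverse(B_inv, r, new_col):
    """Apply (19): replace column r of B by new_col and return the new inverse.

    B_inv is given as a list of rows b^i; r is a 0-based index."""
    pivot = dot(new_col, B_inv[r])
    new_r = [c / pivot for c in B_inv[r]]
    rows = []
    for i, row in enumerate(B_inv):
        if i == r:
            rows.append(new_r)
        else:
            coef = dot(new_col, row)
            rows.append([c - coef * nr for c, nr in zip(row, new_r)])
    return rows


n = 7
identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
q = [-0.426, -0.457, 0.377, -0.846, 1.352, -1.636, 6.056]
B_inv = update_inverse(identity, 6, q)
last_column = [row[6] for row in B_inv]
print([round(c, 5) for c in last_column])
```

Only the last column of the inverse changes in this step; the printed values agree with the inverse shown for Iteration 1 up to rounding in the fifth decimal place.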

Proof of Convergence 

Recall that we are considering the system (7), and that we may refer to the
minimization of $f(y)$ as in (8). We note again that the only assumption made
is that a solution exists (or equivalently, that the minimum of $f(y)$ for $y \ge 0$
is finite). The lines of proof follow quite analogously those given in [6], but for
the sake of the differences are repeated here.

Now the minimum is taken on either in the interior of the positive orthant or
on its boundary. It is taken on in the interior if and only if there is a point
$y > 0$ for which the gradient $d + Qy$ at $y$ vanishes. If in the course of the
iterations some such point is arrived at we are finished. That is, if ever the set
$R$ associated with an iteration is the empty set, and Case II occurs (i.e., $z = 0$)
then we are finished. If the set $R$ is not empty, it specifies the boundary of the
positive orthant we are working with. Now (12) may be written as:

(20) $z = \sum_{i=1}^{n} \omega_i b_i$,

expressing the gradient of $f$ at $y$ in terms of the current basis. Suppose that
$\omega_i = 0$ for each $i$ corresponding to a free column. Then (20) expresses $z$ only in
terms of columns of $I$, and further, $y^T w = 0$. Hence, by the Kuhn-Tucker
conditions (or in this case by the classical Lagrange multiplier conditions), the
current $y$ solves the problem:

(21) Min. $f(y)$ subject to $e_i^T y = 0$, for all $i$ in $R$;

that is, $y$ yields the minimum over the face defined by $R$. If further we have
$w \ge 0$ for the current $w$ then, again, the Kuhn-Tucker conditions show that
$y$ solves the problem:

(22) Min. $f(y)$ subject to $e_i^T y \ge 0$, for all $i$ in $R$,

and hence, a fortiori, solves the problem (8). This will be the case when neither
Case I nor Case II occurs. If $w \ge 0$ does not hold, then we have Case II. Thus,
whenever Case II occurs the current $y$ minimizes $f$ on that face of the boundary
specified by $R$. Thus, in particular, whenever Case II again occurs, and in the
interim the functional $f$ has been decreased, $f$ will have been minimized on a
different face, represented by a different $R$ (or, as we shall say, a better $R$). Now
in going from one iteration to the next, $f$ is not increased, so that one only goes
to better $R$'s. Since the number of sets $R$ is finite, only a finite number of
sequences of events (Case II, decrease in $f$, Case II) is possible.

Convergence of the process will therefore follow when the following facts are
demonstrated:

Lemma 1: When Case I occurs, Case II will occur in a finite number of iterations
unless the optimum is reached.

Lemma 2: When Case II occurs, there will be a decrease in $f$ followed by a
recurrence of Case II in a finite number of iterations, unless the optimum is reached.

Proof of Lemma 1

When Case Ia occurs, the number $s$ of free columns remains fixed. When
Case Ib occurs, the number $s$ of free columns is decreased by one. Therefore if
we show that when Case Ia occurs with $s$ free columns it can continue to occur
for at most $s$ consecutive iterations, the recurrence of Case I can only continue
to the situation where $s = 0$. But this corresponds to the initial iteration $y = 0$
with $f(0) = 0$. But, unless $d \ge 0$, in which case we are finished at the initial
iteration, this situation is impossible. Thus we will have shown that consecutive
occurrence of Case I must lead to Case II or the optimum.

Now when Case Ia occurs, with $b_r$ as the vector being replaced, and with
$\underline{b}_r = Q b^r$ as the replacing vector, we show that the $r$th component of the new $w$
is zero, and that when Case Ia continues to occur, those components of $w$ which
have become 0 in this way remain 0.
For this, consider a basis $B$ with the property that for some $i$ we have $b_i =
K Q b^i$ for some scalar $K$. Since for $j \ne i$ we have $0 = b_i^T b^j = b^{iT} Q b^j$, if for some
$r \ne i$ we have a replacing vector $\underline{b}_r = Q b^r$, as in Case Ia, the new $i$th row of the
inverse remains unchanged: $\underline{b}^i = b^i - (\underline{b}_r^T b^i)\, \underline{b}^r = b^i$.

Now consider $z$ as in (20). The new $z$ is given by:

(22) $\underline{z} = z - \theta_0 Q b^r = \sum_{i=1}^{n} \left[ \omega_i - \theta_0 (b^{iT} Q b^r) \right] b_i$.

The new coefficient of $b_i$ in this expression is the $i$th component of $\underline{w}$. The
coefficient of $b_r$ is 0 by definition of $\theta_0$, and for each $i$ such that we had (i) $\omega_i = 0$
and (ii) $b_i = K Q b^i$ for some constant $K$ we retain (i) $\underline{\omega}_i = 0$ and (ii) $\underline{b}^i = b^i$.
Thus, when Case Ia continues to occur on consecutive iterations, some
additional component of $w$ corresponding to a free column will become and remain 0.
Hence Case Ia cannot continue to occur for more than $s$ iterations. This proves
the lemma.

Proof of Lemma 2 

Suppose Case II occurs. Consider Case IIa. The functional is definitely
decreased, since by (16) we then have:

(23) $f(\underline{y}) = f(y) - \tfrac{1}{2} (b^{rT} Q b^r)\, \theta_0^2$.

Note that since $Q$ is non-negative definite and symmetric, $a^T Q a = 0$ if and only
if $Qa = 0$. Now on the next iteration either the optimum is obtained or not. If
Case I occurs, Lemma 1 shows that either Case II occurs in a finite number of
iterations or else the optimum is obtained. In either case, the sequence Case II,
decrease in $f$, Case II occurs in a finite number of iterations, unless optimality
is obtained.

Consider Case IIb. It is possible that in this case there is no decrease in $f$ in
proceeding to the next iteration. We show that Case I always occurs on the
iteration following this case. With $z$ expressed as in (20) we are supposing that
all coefficients of $b_i$ for $i$ corresponding to a free column are 0. Since $z$ does not
change when $\theta$ is 0, and $\underline{b}_r = e_k$, we have:

(24) $z = \sum_{i=1}^{n} \left[ \omega_i - (\omega_r / e_k^T b^r)\, e_k^T b^i \right] b_i + (\omega_r / e_k^T b^r)\, e_k$,

expressing $z$ in terms of the new basis, and specifying the new components of $w$.
Since $e_k$ cannot be expressed in terms only of the restricted columns (which are
all columns of $I$), some component $e_k^T b^i$ for which the $i$th column of $B$ is free is
not 0, and the corresponding coefficient of $b_i$ in (24) is not 0. Hence Case I
occurs on the next iteration.

Now if Case Ia occurs there is, by (23), a definite decrease in $f$, whereas if
Case Ib occurs it is possible that still $\theta = 0$. But when Case Ib occurs, the
number $s$ of free columns is decreased by one. Now the only way to retain no
decrease in $f$ is to have a sequence of iterations with Case Ib and Case IIb only
occurring, and each time with $\theta = 0$. But since Case Ib always follows Case IIb in
this situation, and since $s$ is decreased each time Case Ib occurs, this sequence
must terminate with either the optimum, a definite decrease (Case Ia or IIa), or
$s = 0$, which, as noted in the proof of Lemma 1, is impossible.

Hence, in either case, when Case IIb occurs, either the optimum or a decrease
in $f$ will follow which will, by Lemma 1, be followed either by the optimum or
by Case II in a finite number of iterations. This proves the lemma.

4. An Example 

The following example is used by Theil and van de Panne [7]. We shall take
advantage of their computation of $Q$. We first consider the problem (8), with $Q$
and $d$ as the given data, and then return to the original problem.

We are solving the problem (8), where:

$$Q = \begin{pmatrix}
1.043 & -0.128 & -0.516 & 0.187 & -0.586 & -0.051 & -0.426 \\
-0.128 & 0.250 & 0.064 & -0.108 & -0.078 & -0.007 & -0.457 \\
-0.516 & 0.064 & 0.379 & -0.127 & 0.200 & -1.211 & 0.377 \\
0.187 & -0.108 & -0.127 & 0.256 & -0.208 & 0.333 & -0.846 \\
-0.586 & -0.078 & 0.200 & -0.208 & 0.673 & 0.936 & 1.352 \\
-0.051 & -0.007 & -1.211 & 0.333 & 0.936 & 12.363 & -1.636 \\
-0.426 & -0.457 & 0.377 & -0.846 & 1.352 & -1.636 & 6.056
\end{pmatrix}$$

and:

$d^T = (4.560 \quad 0.475 \quad -1.229 \quad 1.981 \quad -4.119 \quad -8.508 \quad -8.802)$.

Thus, initially, we have:

a. $B = B^{-1} = I$

b. $y = 0$

c. $z = d$

d. $R = \{1, 2, 3, 4, 5, 6, 7\}$.

Iteration 1

Selecting $b_r$:

(i) $w = B^{-1} z = z = d$.

Since no columns are free, Case II applies. Since $\omega_7 = -8.802$ is the most
negative component of $w$, $b_r = e_7$.

Selecting $\underline{b}_r$:

(ii) $q = Q b^r = Q e_7$:

$q^T = (-0.426 \quad -0.457 \quad 0.377 \quad -0.846 \quad 1.352 \quad -1.636 \quad 6.056)$

(iii) $q^T b^r = q_7 = 6.056$,

(iv) $\theta_0 = \omega_7 / q^T b^r = -1.45343$.

Since $\theta_0 < 0$, and $y - \theta b^r = -\theta e_7 \ge 0$ for any $\theta < 0$, we take $t_0$ infinite.
Hence Case IIa applies and $\theta = \theta_0$ and $\underline{b}_r = q$.
Modifying the data:

a. $$B^{-1} = \begin{pmatrix}
1 & 0 & 0 & 0 & 0 & 0 & 0.07035 \\
0 & 1 & 0 & 0 & 0 & 0 & 0.07546 \\
0 & 0 & 1 & 0 & 0 & 0 & -0.06225 \\
0 & 0 & 0 & 1 & 0 & 0 & 0.13970 \\
0 & 0 & 0 & 0 & 1 & 0 & -0.22326 \\
0 & 0 & 0 & 0 & 0 & 1 & 0.27015 \\
0 & 0 & 0 & 0 & 0 & 0 & 0.16513
\end{pmatrix}$$

obtained using formulas (19).

b. $y = 0 - \theta_0 e_7$:

$y^T = (0 \quad 0 \quad 0 \quad 0 \quad 0 \quad 0 \quad 1.45345)$.

c. $z = d - \theta_0 q$:

$z^T = (3.94084 \quad -0.18922 \quad -0.68106 \quad 0.75140 \quad -2.15396 \quad -10.88581 \quad 0)$

d. $R = \{1, 2, 3, 4, 5, 6\}$.

Iteration 2

Selecting $b_r$:

(i) $w = B^{-1} z = z$ (since the last component of $z$ equals 0).

Since column 7 is the only free column, and the 7th component of $w$ is 0,
Case II again applies. Since $\omega_6 = -10.88581$ is the most negative component of
$w$, $b_r = e_6$.

Selecting $\underline{b}_r$:

(ii) $q = Q b^r \ne 0$:

$q^T = (-0.16608 \quad -0.13046 \quad -1.10915 \quad 0.10445 \quad 1.30123 \quad 11.92103 \quad 0)$.

(iii) $q^T b^r = q^T e_6 = 11.92103$,

(iv) $\theta_0 = \omega_6 / q_6 = -0.91316$.

Since $\theta_0 < 0$, $y - \theta b^r \ge 0$ again for any $\theta < 0$, and we take $t_0$ infinite. Hence
Case IIa applies and $\theta = \theta_0$ with $\underline{b}_r = q$.

Modifying the data:

a. $$B^{-1} = \begin{pmatrix}
1 & 0 & 0 & 0 & 0 & 0.01393 & 0.07411 \\
0 & 1 & 0 & 0 & 0 & 0.01094 & 0.07842 \\
0 & 0 & 1 & 0 & 0 & 0.09305 & -0.03712 \\
0 & 0 & 0 & 1 & 0 & -0.00876 & 0.13733 \\
0 & 0 & 0 & 0 & 1 & -0.10916 & -0.25275 \\
0 & 0 & 0 & 0 & 0 & 0.08389 & 0.02266 \\
0 & 0 & 0 & 0 & 0 & 0 & 0.16513
\end{pmatrix}$$

b. $y^T = (0 \quad 0 \quad 0 \quad 0 \quad 0 \quad 0.91316 \quad 1.70012)$

c. $z^T = (3.78918 \quad -0.30835 \quad -1.69389 \quad 0.84678 \quad -0.96573 \quad 0 \quad 0)$

d. $R = \{1, 2, 3, 4, 5\}$.

Iteration 3 

Selecting $b_r$:

(i) $w = B^{-1}z = z$ (again since the last two components of $z$ are 0; this is due
to the recurrence of Case IIa).

Since columns 6 and 7 of $B$ are free columns, and the corresponding components
of $w$ are 0, and since $w$ is not yet non-negative, Case II again applies. Since
$\omega_3 = -1.69389$ is the most negative component of $w$, $b_r = e_3$.

Selecting $b_r$:

(ii) $q^T = (-0.50494 \;\; 0.08031 \;\; 0.25223 \;\; -0.06461 \;\; 0.23690 \;\; 0 \;\; 0)$.

(iii) $q^T b_r^* = q^T e_3 = 0.25223$.

(iv) $\theta_0 = \omega_3 / q^T e_3 = -6.71566$.

Using formula (17) we have:

so that Case IIa applies; $\theta = \theta_0$; and $b_3 = q$.


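Modifying the data:

a.
\[
B^{-1} =
\begin{bmatrix}
1 & 0 & 2.00191 & 0 & 0 & 0.20021 & -0.00020 \\
0 & 1 & -0.31840 & 0 & 0 & -0.01869 & 0.09024 \\
0 & 0 & 3.96464 & 0 & 0 & 0.36891 & -0.14717 \\
0 & 0 & 0.25616 & 1 & 0 & 0.01508 & 0.12782 \\
0 & 0 & -0.93922 & 0 & 1 & -0.19655 & -0.21789 \\
0 & 0 & 0 & 0 & 0 & 0.08389 & 0.02266 \\
0 & 0 & 0 & 0 & 0 & 0 & 0.16513
\end{bmatrix}.
\]

b. $y^T = (0 \;\; 0 \;\; 6.71566 \;\; 0 \;\; 0 \;\; 1.53805 \;\; 1.45083)$.

c. $z^T = (0.39817 \;\; 0.23098 \;\; 0 \;\; 0.41288 \;\; 0.62521 \;\; 0 \;\; 0)$.

d. $R = \{1, 2, 4, 5\}$.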

Iteration 4 

Selecting $b_r$:

(i) $w = B^{-1}z = z \geq 0$; hence the solution has been found in three iterations.
The fact that in all of the iterations only Case IIa occurred before the optimum
was reached is due to the simplicity of the problem.

The values of the functional $f(y)$ may be obtained in the course of the algorithm
using formula (16), or the final value only may be computed directly when the
optimum has been reached. In the latter case, noting (7) we may write:

(24) $f(y) = \tfrac{1}{2} y^T d + \tfrac{1}{2} y^T (d + Qy) = \tfrac{1}{2} y^T d + \tfrac{1}{2} y^T z = \tfrac{1}{2} y^T d.$





The values of the functional for each iteration, starting with $f(0) = 0$, are
0, -6.397, -11.367, and -17.055.

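We next return to the original problem which gave rise to the above example.
This is an example of the problem (1) and (2), where

\[
Q_0 = \tfrac{1}{2}
\begin{bmatrix}
6 & 1 & 8 & 0 \\
1 & 10 & 1 & 4 \\
8 & 1 & 17 & 3 \\
0 & 4 & 3 & 11
\end{bmatrix},
\]

\[
A =
\begin{bmatrix}
-1 & 0 & 0 & 0 & 1 & 5 & 0 \\
0 & -1 & 0 & 0 & 1 & 0 & 4 \\
0 & 0 & -1 & 0 & 1 & 10 & 0 \\
0 & 0 & 0 & -1 & 1 & 0 & 5
\end{bmatrix},
\]

\[
d_0^T = (0 \;\; 0 \;\; 0 \;\; 0 \;\; \tfrac{5}{3} \;\; 2 \;\; 3).
\]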


One initially computes

\[
x_0 = Q_0^{-1} a =
\begin{pmatrix} 4.560 \\ 0.475 \\ -1.229 \\ 1.981 \end{pmatrix},
\]

and subsequently $d = d_0 - A^T x_0$, which is the $d$ used above, and which shows
that $x_0$, the absolute minimum of the functional, does not satisfy the constraints.
One then computes $Q = A^T Q_0^{-1} A$, and solves the problem as above. $Q_0^{-1}$ in this
case appears as the $4 \times 4$ matrix in the upper left of $Q$.

Finally, it remains to calculate the solution $x$ via formula (5):

\[
x = x_0 - (Q_0^{-1} A) y =
\begin{pmatrix} 0.400 \\ 0.233 \\ 0 \\ 0.414 \end{pmatrix},
\]

with functional value given by

\[
-a^T x + \tfrac{1}{2} x^T Q_0 x = -\tfrac{1}{2} a^T x_0 + y^T d = -17.037.
\]

5. Discussion 

It appears to be well worthwhile to take advantage of the non-singularity of 
the strictly quadratic part of the functional, when such is the case, to examine 
first the point x 0 where the absolute minimum of the functional is taken on. This 
allows the transformation to the problem (8), which could hardly be simpler in
form, and the extremely simple algorithm described above.

Note that (i) there is no need to seek out a ‘first feasible solution’ for the 
constraints, as required for the approaches suggested by Beale [1] and Wolfe [8], 
(ii) there is no need to consider any ‘degeneracy’ cases. 

The formulation (8), involving Lagrange multipliers as it does, is a form of
dual problem to the original, as described by Dorn [4]. It is in fact Dorn’s ‘Type
II’ dual, after eliminating the set of variables not constrained to be non-negative.

As noted, the algorithm is similar to the efficient one proposed by Beale in
1955. Perhaps the chief difference is that the initial data, $Q$, is retained in its
original form throughout the calculations, which are based on a ‘current basis
matrix’, as in the modified simplex method [3] for linear programming.

Perhaps the method here suggested is more efficient than that proposed by
Theil and van de Panne, although a general statement to that effect is out of the
question. Two possible objections to the Theil-van de Panne approach as compared
with ours are that (1) they do not make efficient use of computed data,
but compute afresh inverses of submatrices of $Q_0$, and that (2) for large-order
problems, the amount of data they compute and refer to seems to grow somewhat
combinatorially.

An acknowledgement is in order. Theil and van de Panne base their technique
on some ingeniously derived rules, which were subsequently shown by J. C. G.
Boot [2] to be derivable from the Kuhn-Tucker conditions. It was Boot’s observation
which provided the stimulus for the present method.

SEQUENTIAL PRODUCTION PLANNING OVER TIME
AT MINIMUM COST*

S. M. JOHNSON 
The RAND Corporation 

Production of a given commodity is to be scheduled over time to meet known
future requirements while minimizing total costs. The costs include both storage
and production costs as functions of time. The unit production cost is an increasing
function of the production rate.

Previous solutions to this problem have involved complicated iterative procedures.
A new approach brings out the basic principle involved and leads to a
surprisingly simple solution. This coincides with a common-sense technique sometimes
used in business.


1. Introduction 

Production planning models for determining the optimal production program
over time of a single type item have been studied by numerous authors: Modigliani
and Hohn [1], Hoffman and Jacobs [2], Dantzig and Johnson [3], Arrow
and Karlin [4], Bellman [5], and others. For the case where the unit costs are either
fixed or non-decreasing functions of the production rate, where there are storage
costs, and where there is no cost for changing the rate of production, several
authors have shown that this production model is equivalent to a transportation
model. To solve such a model certain authors, Bowman [6] and Manne [7], have
suggested use of the simplex method for transportation problems. Bishop [8], on
the other hand (following a similar approach used by Prager for the Caterer
Problem), has developed a variant of the iterative simplex method to take advantage
of the structure found in this problem.

This paper goes one step further by showing that the special features of the unit
cost matrix lead to a simple, direct (non-iterative) solution. Indeed, the fundamental
principle is to satisfy in turn each requirement in due-date order by the
cheapest available means.


2. The Problem 

We wish to schedule the production of a given commodity over $n$ successive
periods of time to meet known requirements while minimizing total costs. Requirements
of $R_k$ units are due at the end of the $k$-th period ($k = 1, 2, \ldots, n$).

We consider two kinds of costs: a unit production cost that is a nondecreasing 
function of the rate of production and is also a function of the period, and a unit 
storage cost that is a function of the period stored. 

The key point of departure from previous analyses is that we identify each 
unit of production with its ultimate destination or period when it is to be used. 

* Received February 1957.

Let

$i$ = the period in which the item is produced,

$j$ = the order of production of the item in that period,

$k$ = the period when the item is to be used to satisfy a requirement,

$c_{ijk}$ = the total cost of producing and storing the item.

We assume for fixed $(i, k)$ that

(1) $c_{ijk} \le c_{i,j+1,k}$;

i.e., the unit production costs are nondecreasing functions of the production rate
for each period, the marginal increase being added to the cost of the next item
produced. It is clear that the $j$-th unit must be produced before the $(j + 1)$-st
unit.

We assume for fixed $(i, j)$ that

(2) $c_{ijk} = c_{iji} + A_{i+1} + A_{i+2} + \cdots + A_k$;

i.e., the unit storage costs depend only on the period stored and are added to
the production costs.

The requirement $R_k$ can be considered as $R_k$ unit requirements each due at
time $k$. The entire requirement schedule can be considered as a collection of $R$
unit requirements ordered by their due dates. With this ordering, the optimal
procedure is very simple to state:

Theorem. For a set of unit costs $c_{ijk}$ satisfying (1) and (2), and a known set of due
dates for $R$ units, the total cost is minimized if each unit requirement is met sequentially
in order of its due date by assigning (producing and storing) the cheapest
unit cost available at that stage.

Proof. First note that by (1) this rule automatically satisfies the physical
condition of the problem that a $c_{ijk}$ must be chosen before a $c_{i,j+1,k}$.

Let $P_1 \ge R_1$ items be produced in the first period. By (2) the total costs are
unaffected if we assume that the first $R_1$ of these $P_1$ units are used to meet the
requirement $R_1$, and that the remainder are used for the second and later periods.
It is clear that the first $R_1$ units required have been assigned to production
according to the rule stated in the theorem.

Since all production after the first $R_1$ units in the first period is for use in the
second or later periods, these units will all be stored for at least one period.
Accordingly, one period of storage costs will now be added to unit production costs for
all items that might be produced in the first period after the first $R_1$ units. Let
$P_2 \ge R_2$ be the number of units produced in the first two periods after the first
$R_1$ units. These $P_2$ items can be arranged in increasing order of unit costs (including
any storage), and the first $R_2$ units can be assumed to be used in order to
satisfy the $R_2$ requirement.

If all $P_2$ units have unit costs greater than the cheapest unit that could be
produced in either the first or second period (after the first $R_1$), then it would be
cheaper to have the first of the $R_2$ units produced by the cheapest mode. Thus
for an optimal solution the first of the $R_2$ units must be produced by the cheapest
way to produce a unit either from the first period after the first $R_1$ units or from
the second period. Similarly we can reason that for an optimal solution the
second unit of $R_2$ must be produced by the cheapest means available after the
first unit of $R_2$ was produced in the cheapest way. Thus all $R_2$ units must be
assigned according to our rule.

For all potential units of production not used in the first or second periods,
add to production costs the cost of storage up to period 3. Let $P_3 \ge R_3$ be the
number of units produced in the first three periods after the first $R_1 + R_2$ units
of requirements are met. Again these units can be arranged by increasing cost
and the first $R_3$ units assigned to meet the next $R_3$ units of requirements. We now
apply the same argument to show that each of the $R_3$ items in turn must be produced
by the cheapest means remaining if the solution is to be optimal. The
same inductive argument now applies for any number of periods.

This simple solution has many advantages. It is easy to state: merely satisfy
each requirement sequentially in order of its due date as cheaply as possible.
(This coincides with the common-sense intuitive approach sometimes used in
industry.) It is easy to construct numerically or geometrically if marginal-cost
curves are plotted. The production schedule for the next requirement is superimposed
on the previous total production schedule.
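
The rule of the theorem can be sketched in a few lines. This is an illustrative
sketch only: the function name `greedy_schedule`, the 0-based period numbering,
and the data layout (one list of nondecreasing marginal production costs per
period, one storage cost per period) are assumptions made for the example, not
the author's notation. Each unit requirement is taken in due-date order and
assigned the cheapest remaining unit, with storage added as in (2).

```python
def greedy_schedule(prod_costs, storage, requirements):
    """Assign each unit requirement, in due-date order, to the cheapest source.

    prod_costs[i][j]: production cost of the (j+1)-th unit made in period i,
                      assumed nondecreasing in j as in condition (1).
    storage[i]:       cost of carrying one unit into period i (storage[0] unused),
                      so a unit made in i and used in k pays storage[i+1..k].
    requirements[k]:  number of units due at the end of period k.
    Returns (total_cost, assignments) with assignments as (i, j, k) triples.
    """
    n = len(requirements)
    next_unit = [0] * n          # next unused production slot in each period
    total = 0
    assignments = []
    for k in range(n):
        for _ in range(requirements[k]):
            best = None
            for i in range(k + 1):
                j = next_unit[i]
                if j >= len(prod_costs[i]):
                    continue     # no capacity left in period i
                cost = prod_costs[i][j] + sum(storage[i + 1:k + 1])
                if best is None or cost < best[0]:
                    best = (cost, i, j)
            if best is None:
                raise ValueError("requirements exceed available capacity")
            cost, i, j = best
            next_unit[i] += 1
            total += cost
            assignments.append((i, j, k))
    return total, assignments


# Two periods: one unit due in the first, two in the second.
total, plan = greedy_schedule(
    prod_costs=[[1, 2, 5], [3, 4, 6]],
    storage=[0, 1],
    requirements=[1, 2],
)
```

In this small example the second requirement is met partly from stored
first-period production and partly from second-period production, which is the
superposition of schedules described above.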

The optimal production for $R_1, \ldots, R_n$ can be planned without knowing
$R_{n+1}, \ldots$. Moreover, the optimal production level for the first period can be
carried out knowing only that the subsequent requirements beyond some given
period $n$ have sufficiently small upper bounds so that no production for them in
the first period is required.

The method can easily be extended to the case where initial inventory and
upper bounds on the production rates are present. Also if there is a time lag of $q$
periods from the time a unit starts in production to the time it is completed,
this can be taken care of with a proper definition of cost.

It is not surprising that this simple solution will not extend to the case where
the unit production costs are decreasing functions of the production rates, nor
to the case where the cost of changing production rates from period to period is
considered, unless restrictive assumptions are made.

MATHEMATICAL PROGRAMMING AND SERVICE
SCHEDULING*

W. KARUSH and A. VAZSONYI
The Ramo-Wooldridge Corporation and The University of Chicago

1. Statement of the Problem

Consider the problem of planning a service program over $N$ successive unit
time intervals, $i = 1, 2, 3, \ldots, N - 1, N$. We assume that some measure of
the service level is available and that the service level requirements in each unit
interval

\[ r_i, \quad i = 1, 2, \ldots, N \]

are given. A feasible program of service levels $x_i$
is one for which

\[ x_i \ge r_i, \quad i = 1, 2, \ldots, N \tag{1} \]

where $x_i$ represents the potential service level in each unit interval.

There are many business situations where it might be necessary to keep the
potential service level (preparedness) above requirements at least in some of the
time intervals. For instance, a fleet of trucks (or cabs) or group of production
machines cannot be easily adjusted month by month and, therefore, during
slack periods the potential service level may exceed requirements. In a manufacturing
firm, there are certain functions to be performed, such as maintenance
or clerical work (often overhead type functions), where again the potential
service level is expensive to change. In fact, even in the case of production
workers, the expense of hiring, training, firing, or the contractual obligations of
guaranteed wage agreements may make it undesirable to change the level of
employment during slack periods. The reader will readily find further illustrations
of the type of planning problems we are describing here.

During the last few years, there has been a great deal of work in the theory
of inventory control, and we find it convenient to bring our problem within the
framework of more traditional types of inventory control problems. We consider
service as a commodity which cannot be stored. Then we can say that our
problem is to determine an optimum production program of a nonstorable
(perishable) commodity under the condition that production requirements for
this commodity in each time interval are given. In addition to (1), we shall, for
convenience, suppose that the production is specified in the intervals $i = 0$ and
$i = N + 1$ immediately preceding and following the planning period,

\[ x_0 = r_0, \quad x_{N+1} = r_{N+1}. \tag{2} \]

This assumption simplifies the presentation but is not essential (see Section 4).
Thus, a production program is “feasible” if it satisfies both (1) and (2).

* Received April, 1956.

The requirements vector

\[ r = (r_0, r_1, \ldots, r_N, r_{N+1}) \]

and a feasible production vector

\[ x = (x_0, x_1, \ldots, x_N, x_{N+1}) \]

may be represented graphically as step functions, in the following manner:

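[Figure: step-function graphs of the requirements $r$ and of a feasible program $x$
over the intervals $0, 1, \ldots, N, N + 1$, with $x_0 = r_0$.]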


It will be convenient to use geometric terminology appropriate to this type of 
geometric representation. 

Customarily, three types of cost are considered for each interval $i$: cost of
inventory, cost of production, cost of change of production. In the present context,
the first type of cost does not enter. The second type of cost will be taken
in the form of an increasing, continuous, convex, polygon function $f^{(i)}(x_i)$ of the
production $x_i$, the superscript $i$ attached to $f$ expressing that the cost function
may be different for different unit intervals $i$. More explicitly, $f^{(i)}$ is taken to
have the form (omitting the index $i$)

\[
f(u) =
\begin{cases}
a_1 u + f(0), & \text{for } 0 \le u \le u_1 \\
a_2 (u - u_1) + f(u_1), & \text{for } u_1 \le u \le u_2 \\
a_3 (u - u_2) + f(u_2), & \text{for } u_2 \le u \le u_3 \\
\quad \vdots \\
a_n (u - u_{n-1}) + f(u_{n-1}), & \text{for } u_{n-1} \le u
\end{cases}
\]

where

\[ 0 < a_1 < a_2 < \cdots < a_n. \]

The graph of $f$ has the following appearance:

[Figure: graph of the convex polygon function $f$.]

(In general, the parameters $f(0)$, $u_j$, $a_j$ defining $f^{(i)}$ depend upon $i$ and would be
denoted by $f^{(i)}(0)$, $u_j^{(i)}$, $a_j^{(i)}$.) The cost of production in the $i$-th interval is then

\[ f^{(i)}(x_i). \tag{3} \]

The cost of changing the production level from $x_{i-1}$ to $x_i$ we take proportional
to the difference $(x_i - x_{i-1})$, with cost coefficient $\alpha > 0$ for increase in
production and cost coefficient $\beta > 0$ for decrease in production. More fully,

\[
h(v) =
\begin{cases}
\alpha v, & \text{for } v \ge 0 \\
-\beta v, & \text{for } v \le 0,
\end{cases}
\]

where

\[ \alpha > 0, \quad \beta > 0. \]

Then the cost of change of production in the $i$-th interval is taken as

\[ h(x_i - x_{i-1}) \tag{4} \]

or, alternatively,

\[ \alpha \max\{x_i - x_{i-1}, 0\} + \beta \max\{x_{i-1} - x_i, 0\}. \]

Observe that the change of production cost function $h$ is assumed to be independent
of $i$.

The total cost of a feasible production program $x = (x_0, x_1, \ldots, x_N, x_{N+1})$
is then

\[ C(x) = \sum_{i=1}^{N} f^{(i)}(x_i) + \sum_{i=1}^{N+1} h(x_i - x_{i-1}) \tag{5} \]

or, alternatively,

\[ C(x) = \sum_{i=1}^{N} f^{(i)}(x_i) + \alpha \sum_{i=1}^{N+1} \max\{x_i - x_{i-1}, 0\} + \beta \sum_{i=1}^{N+1} \max\{x_{i-1} - x_i, 0\}. \]
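
The cost (5) is straightforward to evaluate numerically. The sketch below is
illustrative only: the names `piecewise_cost`, `change_cost`, and `total_cost`,
and the representation of each $f^{(i)}$ by a value at 0, a list of critical
levels, and a list of slopes, are assumptions made for the example.

```python
def piecewise_cost(f0, breaks, slopes, u):
    """Value at u of a convex polygon function with f(0) = f0.

    breaks: critical levels u_1 < u_2 < ...; slopes: a_1 < a_2 < ...,
    with len(slopes) == len(breaks) + 1.
    """
    value = f0
    previous = 0.0
    for level, slope in zip(breaks, slopes):
        if u <= level:
            return value + slope * (u - previous)
        value += slope * (level - previous)
        previous = level
    return value + slopes[-1] * (u - previous)


def change_cost(v, alpha, beta):
    """h(v): alpha per unit of increase, beta per unit of decrease."""
    return alpha * v if v >= 0 else -beta * v


def total_cost(x, f0, breaks, slopes, alpha, beta):
    """C(x) of (5) for x = (x_0, ..., x_{N+1}); entries 0 and N+1 of the
    cost data are placeholders because only intervals 1..N are produced."""
    n = len(x) - 2
    production = sum(
        piecewise_cost(f0[i], breaks[i], slopes[i], x[i]) for i in range(1, n + 1)
    )
    change = sum(change_cost(x[i] - x[i - 1], alpha, beta) for i in range(1, n + 2))
    return production + change


# N = 3, the same cost function in every interval, alpha = beta = 1.
f0 = [0.0] * 5
breaks = [[], [2.5], [2.5], [2.5], []]
slopes = [[0.0], [0.5, 4.0], [0.5, 4.0], [0.5, 4.0], [0.0]]
cost_of_requirements = total_cost([3, 1, 1, 2, 3], f0, breaks, slopes, 1.0, 1.0)
cost_of_level_program = total_cost([3, 2.5, 2.5, 2.5, 3], f0, breaks, slopes, 1.0, 1.0)
```

In this example the level program costs less than producing exactly to
requirements, because the saving in change-of-production cost outweighs the
extra production cost.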

The problem is to determine a production program $x$ satisfying (1) and (2)
which provides $C(x)$ with a minimum among all programs $x$ satisfying (1) and
(2). This problem is a generalization of one proposed by Bellman, Glicksberg,
and Gross in “The theory of dynamic programming as applied to a smoothing
problem”, J. Soc. Indust. Appl. Math., Vol. 2, pp. 82-88 (1954). In the present
paper we shall describe an effective method for the construction of an optimal
program $x$.

2. Construction of Optimal Program

Let us understand henceforth that “program” or “vector” means “feasible
program” or “feasible vector”. By an allowable deformation of a vector

\[ x = (x_0, x_1, \ldots, x_N, x_{N+1}), \]

we shall mean a transformation of $x$ into some vector $x^*$ in such a way that the
cost $C$ is not increased, i.e.,

\[ C(x^*) \le C(x). \]

We shall be dealing with three types of allowable transformations, to be numbered
1°, 2°, 3°. The first and second will be described in this section, the remaining
one in the next.

To describe the transformations certain notation is convenient. Given a
vector $x$, and its graph, the symbol

\[ I_k \]

will be used to denote any horizontal segment of the graph consisting of $k$ successive
intervals $x_i$ (other than $x_0$ and $x_{N+1}$) at the same level:

\[ x_j = x_{j+1} = \cdots = x_{j+k-1}, \quad j \ne 0, \quad j + k - 1 \ne N + 1. \]

The range of values of $k$ is $1 \le k \le N$. Let

\[ x_L \text{ and } x_R \]

denote the leftmost and rightmost intervals of $I_k$, respectively:

\[ L = j, \quad R = j + k - 1. \]

We wish to study the effect on $C(x)$ of raising or lowering a given horizontal
segment $I_k$ of $x$. To do this we introduce the notion of critical level of $x$ (or the
graph of $x$) by which is meant any one of the levels

\[ u_1^{(i)}, u_2^{(i)}, \ldots, \quad i = 1, 2, \ldots, N. \]

[Figure: graph of $x$ with the critical levels marked.]

The critical levels for each $i$ are simply the values $u$ at which the derivative
$Df^{(i)}$ of $f^{(i)}$ has a discontinuity. Let

\[ D^+ f, \quad D^- f \]

denote, respectively, the right-hand derivative and the left-hand derivative of
$f$. Except at a critical level, of course, these two derivatives are equal; between
critical values they are constant and equal.

With a given horizontal segment $I_k$, we associate the quantities

\[ {}^{+}A_k = \sum_{x_i} D^+ f^{(i)}(x_i), \quad {}^{-}A_k = \sum_{x_i} D^- f^{(i)}(x_i) \tag{6} \]

(summed over the $x_i$ belonging to $I_k$).

Observe that if a given vector $x$ is deformed by raising one of its segments $I_k$
an amount $\Delta u > 0$ (without crossing a critical level above $I_k$), then the production
cost term (i.e., the first term of (5)) increases by the amount

\[ {}^{+}A_k \cdot \Delta u. \]

If it is deformed by lowering $I_k$ an amount $\Delta u > 0$, then this term decreases by
the positive amount

\[ {}^{-}A_k \cdot \Delta u. \]

A final definition: the segment $I_k$ is a minimum segment in case the immediate
neighbors of $I_k$ lie strictly above $I_k$, i.e.,

\[ x_{L-1} > x_L \quad \text{and} \quad x_R < x_{R+1}. \]

We are now prepared to formulate the first two allowable deformations.

1°) If $I_k$ is a minimum segment and

\[ {}^{+}A_k \le \alpha + \beta, \]

raise $I_k$ to any level not above the lowest of $x_{L-1}$, $x_{R+1}$, and the critical levels
above $I_k$.

2°) If $I_k$ is any horizontal segment and

\[ {}^{-}A_k > \alpha + \beta, \]

lower $I_k$ to any level not below the highest of the critical values below $I_k$.

To verify that 1°) is an allowable deformation, i.e., decreases $C(x)$, notice that
raising $I_k$ by an amount $\Delta u$ increases the first term of (5) by ${}^{+}A_k \cdot \Delta u$, but decreases
the second term by $(\alpha + \beta) \cdot \Delta u$ (because $I_k$ is minimum). Under the
assumption ${}^{+}A_k \le \alpha + \beta$ the net change is a decrease, as required. In the case
of 2°), lowering $I_k$ by $\Delta u > 0$ decreases the first term by ${}^{-}A_k \cdot \Delta u$ and at most
increases the second term by $(\alpha + \beta) \cdot \Delta u$. Thus, the net change is a decrease.
Observe that 2° is a strictly decreasing transformation; 1° is strictly decreasing in
case $\alpha + \beta$ is not equal to any sum of $N$ or fewer $a_j$'s (for, then, the inequality
in 1° is strict).

The program $\bar{r}$. From the particular program

\[ r = (r_0, r_1, \ldots, r_N, r_{N+1}) \]

we shall construct another program

\[ \bar{r} = (r_0, \bar{r}_1, \ldots, \bar{r}_N, r_{N+1}) \]

by a succession of transformations of type 1°. Consider the minimum segments
of $r$ ordered, say, from left to right. (If $r$ has no such segments, we take $\bar{r} = r$;
this occurs when $r$ has the form

\[ r_0 \le r_1 \le \cdots \le r_j, \quad r_j \ge r_{j+1} \ge \cdots \ge r_{N+1}, \quad j = 0, 1, 2, \ldots, N + 1.) \]

Step 1: Consider the first minimum segment, say $I_s$, and the corresponding
value ${}^{+}A_s$. If ${}^{+}A_s > \alpha + \beta$, step 1 is complete and we pass on to the next minimum
segment. If ${}^{+}A_s \le \alpha + \beta$, raise $I_s$ to $\min\{r_{L-1}, r_{R+1}, h\}$ where $h$ is the lowest
critical level above $I_s$. Having performed this deformation, now consider $I_t$,
the largest $I_k$ containing $I_s$ in its new position (this will be $I_s$ itself unless $I_s$ has
been raised to the level of one of its neighbors). If $I_t$ is minimum and ${}^{+}A_t \le \alpha + \beta$,
raise $I_t$ to the minimum of its neighbors and the critical levels above it. Continue
in this way until the original minimum segment $I_s$ has been deformed to
an $I_k$ which is either not a minimum segment, or is a minimum segment with
${}^{+}A_k > \alpha + \beta$. This completes step 1. Step 2 is to deform the second minimum
segment of $r$ in the same manner. This process is applied to all the minimum
segments of $r$ in turn. The result is a program $\bar{r}$ such that:

If $I_k$ is a minimum segment of $\bar{r}$, then

\[ {}^{+}A_k > \alpha + \beta. \tag{7} \]

Theorem. The vector $\bar{r}$ is an optimal program. Furthermore $\bar{r}$ lies above every
optimal program. Also, if $\alpha + \beta$ is not the sum of $N$ or fewer values $a_j^{(i)}$, then $\bar{r}$
is the unique optimal program.

This is the main result of this paper. The details of the proof are given in the
next section. Notice that the algorithm for constructing $\bar{r}$ is a simple one computationally.
(The sense in which one program $x$ lies above another $y$ is that $x_i \ge y_i$
for all $i = 0, 1, 2, \ldots, N + 1$.)
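
The construction just described can be sketched as follows. This is an
illustrative sketch under stated assumptions, not the authors' program: the
names `right_slope`, `minimum_segments`, and `raise_minimum_segments` are
hypothetical, each $f^{(i)}$ is represented by its critical levels and slopes,
and the sketch simply repeats deformation 1° on the leftmost eligible minimum
segment until condition (7) holds, rather than following one original segment
at a time as in the text.

```python
from bisect import bisect_right


def right_slope(breaks, slopes, u):
    """Right-hand derivative D+ f(u) of a convex polygon function."""
    return slopes[bisect_right(breaks, u)]


def minimum_segments(x):
    """Return (L, R) for every horizontal segment of x_1..x_N whose two
    immediate neighbors lie strictly above it."""
    n = len(x) - 2
    segments = []
    left = 1
    while left <= n:
        right = left
        while right + 1 <= n and x[right + 1] == x[left]:
            right += 1
        if x[left - 1] > x[left] and x[right + 1] > x[right]:
            segments.append((left, right))
        left = right + 1
    return segments


def raise_minimum_segments(r, breaks, slopes, alpha, beta):
    """Apply deformation 1 repeatedly, starting from the requirements r."""
    x = list(r)
    changed = True
    while changed:
        changed = False
        for left, right in minimum_segments(x):
            level = x[left]
            a_plus = sum(
                right_slope(breaks[i], slopes[i], level)
                for i in range(left, right + 1)
            )
            if a_plus <= alpha + beta:
                critical = [
                    u for i in range(left, right + 1) for u in breaks[i] if u > level
                ]
                target = min([x[left - 1], x[right + 1]] + critical)
                for i in range(left, right + 1):
                    x[i] = target
                changed = True
                break  # segments may have merged, so rescan from the left
    return x


# N = 3, one critical level 2.5 with slopes 0.5 and 4, alpha = beta = 1.
breaks = [[], [2.5], [2.5], [2.5], []]
slopes = [[0.0], [0.5, 4.0], [0.5, 4.0], [0.5, 4.0], [0.0]]
r_bar = raise_minimum_segments([3, 1, 1, 2, 3], breaks, slopes, 1.0, 1.0)
```

In the example the segment at level 1 is first raised to its neighbor at level
2, the merged segment is then raised to the critical level 2.5, and there the
right-hand slopes sum to more than $\alpha + \beta$, so the process stops.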

3. Proof of Optimality 

The proof is based on two lemmas; in the first lemma we show how to deform
any program $y$ into one lying above $\bar{r}$, and in the second lemma we show how
to deform any program $x$ lying above $\bar{r}$ into $\bar{r}$ itself.

Lemma 1. Every vector $y$ may be deformed into a vector $x$ lying above $\bar{r}$ by the
use of deformation 1° alone.

Proof. Consider the original requirements vector $r$ and an arbitrary (feasible)
vector $y$; the vector $y$ lies above $r$. The process of deforming $r$ into $\bar{r}$ consisted
of a series of deformations 1° applied to minimum segments of $r$. Imagine repeating
the process of deforming $r$ into $\bar{r}$ with this addition: whenever, in the
act of raising a minimum $I_t$ of $r$, we encounter a segment $I_t'$ of $y$, we carry this
segment up with $I_t$. Such a segment $I_t'$ is necessarily minimum, since $y$ originally
lay above $r$. Also ${}^{+}A_t' \le \alpha + \beta$; for, (i) ${}^{+}A_t \le \alpha + \beta$ and (ii) since every term
in the sum defining ${}^{+}A_t'$ occurs in the sum defining ${}^{+}A_t$, ${}^{+}A_t' \le {}^{+}A_t$. By carrying
along segments of $y$ in this way as we deform $r$ into $\bar{r}$, we achieve a deformation
of $y$ into some vector $x$ lying above $\bar{r}$, using a series of deformations 1°.

Before proceeding to the next lemma, we formulate an additional allowable
deformation.

3°) Lower any $x_i$ to a level not below the minimum of its neighbors, i.e., not
below both $x_{i-1}$ and $x_{i+1}$.

For, lowering $x_i$ an amount $\Delta u > 0$ strictly decreases $f^{(i)}(x_i)$ in the first term
of $C(x)$, and does not increase $h(x_i - x_{i-1}) + h(x_{i+1} - x_i)$ in the second term.
Notice that 3° is a strictly decreasing deformation.

Lemma 2. Every vector $x$ which lies above $\bar{r}$ may be transformed into $\bar{r}$ by use of
the deformations 2° and 3° alone.

Proof. First we prove:

$x$ can be deformed into a vector $z$ lying above $\bar{r}$ such that $z$ coincides with $\bar{r}$
at the end values $\bar{r}_0 = r_0$, $\bar{r}_{N+1} = r_{N+1}$ and at the minimum segments $I_k$ of $\bar{r}$.
Coincidence at the end values holds by requirement (2) of feasibility. Now, consider
any minimum segment $I_k$ of $\bar{r}$, and those intervals $x_i$ of $x$ lying above $I_k$.
Select a lowest such $x_i$, say $x_s$. By use of deformation 3° every $x_i$ over $I_k$ may
be brought to the level $x_s$, producing thereby a segment $I_k'$ belonging to (the
deformed) $x$. If $I_k' = I_k$, the argument is complete; if $I_k'$ is strictly above $I_k$,
then

\[ {}^{-}A_k' \ge {}^{+}A_k, \]

which follows from the definition of these quantities. From (7)

\[ {}^{+}A_k > \alpha + \beta. \]

Hence deformation 2° is available for lowering $I_k'$ into coincidence with $I_k$, as
required.

The graph of $\bar{r}$ can be decomposed into successive parts

\[ \bar{r}_0, \bar{r}_1, \ldots, \bar{r}_{i_1} \]
\[ \bar{r}_{i_1}, \bar{r}_{i_1+1}, \ldots, \bar{r}_{i_2} \]
\[ \cdots \]
\[ \bar{r}_{i_p}, \bar{r}_{i_p+1}, \ldots, \bar{r}_{N+1} \]

such that each part is monotonic (increasing or decreasing) and has its minimum
end value coinciding with $\bar{r}_0$, $\bar{r}_{N+1}$, or an interval $\bar{r}_i$ belonging to a minimum
$I_k$ of $\bar{r}$. Consider the corresponding parts of $z$, namely,

\[ z_0, z_1, \ldots, z_{i_1} \]
\[ z_{i_1}, z_{i_1+1}, \ldots, z_{i_2} \]
\[ \cdots \]
\[ z_{i_p}, z_{i_p+1}, \ldots, z_{N+1} \]

These parts are not necessarily monotonic, but by the preceding paragraph the
minimum of each such part of $z$ coincides with the minimum end value of the
corresponding part of $\bar{r}$. To be definite, consider an increasing part of $\bar{r}$, say,

\[ \bar{r}_j \le \bar{r}_{j+1} \le \cdots \le \bar{r}_k. \]

Then by the preceding paragraph

\[ z_j = \bar{r}_j, \]

and

\[ z_j = \min\{z_j, z_{j+1}, \ldots, z_k\}, \]

the latter being a consequence of $z$ lying above $\bar{r}$. By successive applications of
deformation 3° we may bring $z_{j+1}$ into coincidence with $\bar{r}_{j+1}$, $z_{j+2}$ into coincidence
with $\bar{r}_{j+2}$, $\ldots$, $z_k$ into coincidence with $\bar{r}_k$. Dealing with the various
parts of $z$ in this way we ultimately deform all of $z$ into $\bar{r}$. The lemma is thereby
established.

The main theorem is now easily argued. By Lemmas 1 and 2, any feasible $x$
can be allowably deformed into $\bar{r}$, i.e.,

\[ C(x) \ge C(\bar{r}). \]

Hence $\bar{r}$ is an optimal program. To show that $\bar{r}$ lies above every optimal program,
consider an optimal program $y$,

\[ C(y) = C(\bar{r}), \]

which has at least one component, say $y_i$, lying strictly above $\bar{r}_i$. By Lemma
1 we may suppose that $y$ has already been deformed to lie above $\bar{r}$. By Lemma
2 we may deform $y$ into $\bar{r}$ by the application of 2° and 3° only. Since $y_i > \bar{r}_i$,
at least one such application is required, and since 2° and 3° are strictly decreasing
(as noted earlier) we would have

\[ C(y) > C(\bar{r}). \]

The contradiction establishes the second statement of the theorem. The last
statement follows from the fact that, under the given assumption on $\alpha + \beta$,
deformation 1° is also strictly decreasing; thus any $x$ different from $\bar{r}$ will require
at least one deformation 1°, 2°, or 3° to be brought into coincidence with
$\bar{r}$, so that $C(x) > C(\bar{r})$.


4. Modifications of the problem 

It was pointed out earlier that the terminal conditions (2) were not essential
to the applicability of the method used in this paper. More precisely, one may
consider the class of programs

\[ x = (x_1, x_2, \ldots, x_N) \]

satisfying (1), and search for one that minimizes the cost function

\[ \sum_{i=1}^{N} f^{(i)}(x_i) + \sum_{i=2}^{N} h(x_i - x_{i-1}). \]

This problem requires a slight modification of the foregoing procedure. In addition
to segments $I_k$ of $x$, for which both the leftmost interval $x_L$ is not $x_1$ and the
rightmost interval is not $x_N$, we distinguish initial segments $H_k$ with $x_L$ coincident
with $x_1$ and terminal segments $J_k$ with $x_R$ coincident with $x_N$. Deformations
1° and 2° apply to segments $H_k$ with $\alpha + \beta$ replaced by $\alpha$, and to segments $J_k$
with $\alpha + \beta$ replaced by $\beta$; deformation 3° remains unchanged. (Here, $H_k$ is understood
to be a minimum segment in case $x_R < x_{R+1}$, and $J_k$ in case $x_{L-1} > x_L$.)
The optimal program $\bar{r}$ is constructed by applying 1° to the successive minimum
$I_k$ of $r$, as described in Section 2 and, having performed these deformations,
applying 1° to the initial minimum segment $H_k$ and terminal minimum segment
$J_k$ (if any).

Perhaps the most realistic problem is the one that retains the first of the two
conditions (2). Namely, among all programs

\[ x = (x_0, x_1, x_2, \ldots, x_{N-1}, x_N) \]

satisfying (1) and

\[ x_0 = r_0, \]

find one which minimizes the cost function

\[ \sum_{i=1}^{N} f^{(i)}(x_i) + \sum_{i=1}^{N} h(x_i - x_{i-1}). \]

The method of the preceding paragraph solves this problem with the modification
that the type of segment $H_k$ is not introduced.

The results of this paper may be generalized in another direction. It was assumed
that the production cost functions $f^{(i)}$ were quasi-linear functions, i.e.,
made up of linear parts. Essentially the same procedure produces an optimal
program $\bar{r}$ in the more general case where the cost functions are arbitrary increasing,
convex, continuous functions made up of a finite number of continuously
differentiable parts.

AN ANALYTIC SOLUTION OF THE
WAREHOUSE PROBLEM*

STUART E. DREYFUS
The RAND Corporation

1. Introduction

In a recent paper [1], Bellman uses dynamic programming to establish a
computational algorithm for the solution of the “warehouse” problem [2]. The
present paper also employs the dynamic-programming approach and shows that
the structure of the solution can be determined analytically, with numerical
results easily obtained via recursive formulas.

2. The Warehouse Problem 

The problem considered here can be formulated as follows. Given a warehouse 
of fixed capacity, B, and an initial stock, v, of a certain product, subject to 
known seasonal fluctuation in selling price and cost, what is the optimal pattern 
of purchasing, storage, and sales? 

3. Mathematical Formulation 

Let the process continue for $N$ periods. When $i$ periods remain in the process,
let

$c_i$ = cost per unit
$p_i$ = selling price per unit
$x_i$ = amount bought
$y_i$ = amount sold

$f_i(v)$ = the profit obtained over the remaining $i$ periods, where initial stock is
$v$, and an optimal policy is used.

As discussed in Bellman’s paper [1], application of the “principle of optimality”
of dynamic programming yields the functional equation

\[ f_N(v) = \max_{x_N, y_N} \left[ p_N y_N - c_N x_N + f_{N-1}(v + x_N - y_N) \right] \tag{3.1} \]

where the maximum is over the region

\[
\begin{aligned}
&\text{(a)} \quad y_N \le v \\
&\text{(b)} \quad v + x_N - y_N \le B \\
&\text{(c)} \quad x_N, y_N \ge 0
\end{aligned}
\tag{3.2}
\]

The analytic structure of the solution will be deduced from the above equation.

* Received February 1957.

4. A Transformation

Let $u$ equal the level of inventory attained at the end of the period under investigation.
The choice of $u$ will be called the “policy” for that period and we
wish to determine the optimal sequence $u_1, \ldots, u_N$.

Clearly $u = v + x - y$ and is less than or equal to $B$. Let us return to equation
(3.1). With the new notation this becomes


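$$ f_N(v) = \max_{\substack{0 \le u \le B \\ y_N \le v \\ x_N,\, y_N \ge 0}} \left[ p_N y_N - c_N x_N + f_{N-1}(u) \right] \tag{4.1} $$

$$ \phantom{f_N(v)} = \max_{0 \le u \le B} \left[ \max_{\substack{v + x_N - y_N = u \\ y_N \le v \\ x_N,\, y_N \ge 0}} (p_N y_N - c_N x_N) + f_{N-1}(u) \right] \tag{4.2} $$

$$ \phantom{f_N(v)} = \max_{0 \le u \le B} \left[ \phi_N(u, v) + f_{N-1}(u) \right] \tag{4.3} $$

where

$$ \phi_N(u, v) = \max_{\substack{v + x_N - y_N = u \\ y_N \le v \\ x_N,\, y_N \ge 0}} (p_N y_N - c_N x_N). \tag{4.4} $$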


In the determination of $\phi_N(u, v)$ we are faced with the maximization of a linear
function over the points on a straight line, so that we need only investigate the
end points.

When $0 \le u \le v$, the two points under consideration are $x_N = 0$, $y_N = v - u$
and $x_N = u$, $y_N = v$. In this region

$$ \phi_N(u, v) = \max \left[ p_N (v - u),\ p_N v - c_N u \right]. \tag{4.5} $$

Fig. 1. Case 1: $c_N > p_N$
Arguing similarly, for $v \le u \le B$

$$ \phi_N(u, v) = \max \left[ -c_N (u - v),\ p_N v - c_N u \right]. \tag{4.6} $$

For fixed v, $0 \le v \le B$, the result can be shown geometrically in two cases:

Case 1: $c_N > p_N$

Case 2: $p_N > c_N$

Having established the nature of the function $\phi_N(u, v)$, we shall proceed to the
proof of a theorem.
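The end-point argument behind (4.5) and (4.6) can be illustrated with a short sketch. The function names `phi` and `phi_scan`, and the numbers in the usage note, are illustrative assumptions rather than part of the paper; `phi_scan` simply walks along the feasible segment as a cross-check.

```python
def phi(u, v, p, c):
    """One-period return for moving stock from v to u, per eqs. (4.5)-(4.6)."""
    if u <= v:
        # End points: sell v - u and buy nothing, or sell all v and buy u.
        return max(p * (v - u), p * v - c * u)
    # End points: buy u - v and sell nothing, or sell all v and buy u.
    return max(-c * (u - v), p * v - c * u)


def phi_scan(u, v, p, c, steps=1000):
    """Cross-check: scan sales y over the feasible segment, with x = u - v + y."""
    lo = max(0.0, v - u)          # x >= 0 forces y >= v - u
    best = float("-inf")
    for k in range(steps + 1):
        y = lo + (v - lo) * k / steps   # y <= v
        x = u - v + y
        best = max(best, p * y - c * x)
    return best
```

For example, with the hypothetical values p = 3, c = 4, v = 5, both functions give 9 at u = 2 and -12 at u = 8, the first from selling three units and the second from buying three.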

5. Theorem 

The structure of the function $f_N(v)$ defined in equation (4.3) is not immediately
obvious. However, the following surprising property holds:

Theorem 1: The function $f_N(v)$ is linear in v, the coefficients being functions of
$p_1, \ldots, p_N$ and $c_1, \ldots, c_N$,

$$ f_N(v) = K_N(p_1, p_2, \ldots, p_N;\ c_1, c_2, \ldots, c_N) + L_N(p_1, p_2, \ldots, p_N;\ c_1, c_2, \ldots, c_N)\, v. \tag{5.1} $$

Furthermore, the optimal policy, u, is independent of v, the initial stock, and depends
only upon the selling prices and costs.

6. Proof of Theorem 1 

The proof is by induction. Clearly $f_1(v) = p_1 v$ since $f_0(u)$, the zero stage return,
is identically zero. Assume that $f_{N-1}(v) = K_{N-1} + L_{N-1} v$ where $K_{N-1}$ and
$L_{N-1}$ are determined by the (N - 1) prices and costs, $\{p_i, c_i\}$, $i = 1, 2, \ldots, N - 1$.
We shall show that $f_N(v)$ has the same form where the coefficients now
depend upon the sequence $\{p_i, c_i\}$, $i = 1, 2, \ldots, N$.

Fig. 2. Case 2: $p_N > c_N$

As in case 1, let $c_N$ be
greater than $p_N$. Due to our hypothesis concerning the linearity of $f_{N-1}(u)$
the maximum must occur at one of three points: u = 0, u = v, or u = B. Thus,

$$ f_N(v) = \max \left[ \phi_N(0, v) + f_{N-1}(0),\ \phi_N(v, v) + f_{N-1}(v),\ \phi_N(B, v) + f_{N-1}(B) \right] \tag{6.1} $$

$$ \phantom{f_N(v)} = \max \left[ p_N v + K_{N-1},\ K_{N-1} + L_{N-1} v,\ -c_N (B - v) + K_{N-1} + L_{N-1} B \right] \tag{6.2} $$

Since the third quantity in (6.2) is greater than the first two if and only if $c_N < L_{N-1}$,
we have established the condition for a choice of u = B. The second quantity
is maximum when $p_N < L_{N-1} < c_N$ and the first is largest when $L_{N-1} < p_N$.
In all three cases it should be noted that the maximizing u is independent of v.
If u is taken equal to B, then

$$ f_N(v) = (K_{N-1} + L_{N-1} B - c_N B) + c_N v. \tag{6.3} $$

Hence $K_N = K_{N-1} + L_{N-1} B - c_N B$ and $L_N = c_N$. If u = v

$$ f_N(v) = K_{N-1} + L_{N-1} v \tag{6.4} $$

whence $K_N = K_{N-1}$ and $L_N = L_{N-1}$. Finally u = 0 leads to

$$ f_N(v) = K_{N-1} + p_N v. \tag{6.5} $$

Hence $K_N = K_{N-1}$, $L_N = p_N$. In each case $f_N(v)$ is a linear function of v with the
new coefficients depending upon $p_1, \ldots, p_N$ and $c_1, \ldots, c_N$.

Case 2, $p_N > c_N$, remains to be considered. Since both $\phi_N(u, v)$ and $f_{N-1}(u)$ are
linear we must investigate only two points, u = 0 and u = B. Here

$$ f_N(v) = \max \left[ \phi_N(0, v) + f_{N-1}(0),\ \phi_N(B, v) + f_{N-1}(B) \right] \tag{6.6} $$

$$ \phantom{f_N(v)} = \max \left[ p_N v + K_{N-1},\ p_N v - c_N B + K_{N-1} + L_{N-1} B \right]. \tag{6.7} $$

We have a maximum at u = 0 if $L_{N-1} < c_N$ and at u = B if the reverse inequality
is true. In these cases

$$ f_N(v) = K_{N-1} + p_N v, \tag{6.8} $$

if u = 0, with $K_N = K_{N-1}$, $L_N = p_N$. On the other hand,

$$ f_N(v) = (K_{N-1} + L_{N-1} B - c_N B) + p_N v \tag{6.9} $$

if u = B, with $K_N = K_{N-1} + L_{N-1} B - c_N B$, $L_N = p_N$. This completes the proof.
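The case analysis in the proof translates directly into a recurrence for the coefficients. The following is a minimal sketch, not the author's own program: the function name, the list layout (index i means "i periods remain", index 0 unused) and the string labels for the policy are assumptions made for illustration, and ties $p_i = c_i$ are arbitrarily routed through Case 2.

```python
def warehouse_coefficients(p, c, B):
    """Return lists K, L, u with f_i(v) = K[i] + L[i] * v and policy u[i].

    p[i], c[i]: selling price and cost when i periods remain (index 0 unused).
    u[i] is "0" (sell out), "v" (do nothing) or "B" (end the period full).
    """
    K, L, u = [0.0], [0.0], [None]            # f_0 is identically zero
    for i in range(1, len(p)):
        K_prev, L_prev = K[-1], L[-1]
        if c[i] > p[i]:                       # Case 1
            if L_prev > c[i]:                 # eq. (6.3): fill the warehouse
                K.append(K_prev + (L_prev - c[i]) * B)
                L.append(c[i])
                u.append("B")
            elif L_prev > p[i]:               # eq. (6.4): do nothing
                K.append(K_prev)
                L.append(L_prev)
                u.append("v")
            else:                             # eq. (6.5): sell out
                K.append(K_prev)
                L.append(p[i])
                u.append("0")
        else:                                 # Case 2 (ties placed here by assumption)
            if L_prev > c[i]:                 # eq. (6.9): sell v, buy B
                K.append(K_prev + (L_prev - c[i]) * B)
                L.append(p[i])
                u.append("B")
            else:                             # eq. (6.8): sell out
                K.append(K_prev)
                L.append(p[i])
                u.append("0")
    return K, L, u
```

Each stage costs a constant amount of work, so the whole policy is obtained in time proportional to N, with no search over stock levels.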


7. Discussion

Let us now consider the economic interpretation of the problem, and investigate
our mathematical results. Apparently, if $c_N > p_N$, we have three alternatives,
dependent upon other parameters. Equations (6.3)-(6.5) have the
following significance: (6.3) represents a purchase of enough goods to fill the
warehouse and the current cost of this decision, $-\phi_N(B, v)$, is $c_N(B - v)$. Equation
(6.4) corresponds to doing nothing, with associated current cost of zero. Finally,
(6.5) arises from selling all v items with which the period was entered and derives
an immediate return of $p_N v$. Turning to equations (6.8) and (6.9), where $p_N > c_N$,
we have a slightly different interpretation. Here a policy dictating a final level of
B means the sale of v and purchase of B, with associated return $p_N v - c_N B$. A
choice of u = 0, as above, means the sale of the entire stock, v, and returns
$p_N v$. In all, we then have four distinct policies: sell, buy, sell and buy, and do
nothing, and in each case the policy is pursued up to the constraint of warehouse
capacity or stock on hand.


8. A Numerical Example 

Recalling that $p_i$ and $c_i$ denote the selling price and cost when i periods remain, let us
consider the following 10 period process.

$c_{10} = 8$, $c_9 = 8$, $c_8 = 2$, $c_7 = 3$, $c_6 = 4$, $c_5 = 3$, $c_4 = 3$, $c_3 = 2$, $c_2 = 5$, $c_1 = 3$

$p_{10} = 3$, $p_9 = 6$, $p_8 = 7$, $p_7 = 1$, $p_6 = 4$, $p_5 = 5$, $p_4 = 5$, $p_3 = 1$, $p_2 = 3$, $p_1 = 2$.

We desire $K_{10}$, $L_{10}$, and $u_{10}$, the return coefficients and policy when ten periods
remain, i.e. at the beginning of the process. Since $f_1(v) = p_1 v$, we conclude that
$K_1 = 0$, $L_1 = p_1$, $u_1 = 0$ so

$$ K_1 = 0, \qquad L_1 = 2, \qquad u_1 = 0. \tag{8.1} $$

We note now that $c_2 > p_2$ so we refer to equations (6.3)-(6.5). Since $L_1 < p_2$,
equation (6.5) is applicable and

$$ K_2 = 0, \qquad L_2 = 3, \qquad u_2 = 0. \tag{8.2} $$

For the third from last period, $c_3 > p_3$ and $L_2 > c_3$ yields, from (6.3)

$$ K_3 = B, \qquad L_3 = 2, \qquad u_3 = B. \tag{8.3} $$

Continuing this process,

$$ \begin{array}{lll}
K_4 = B & L_4 = 5 & u_4 = 0 \\
K_5 = 3B & L_5 = 5 & u_5 = B \\
K_6 = 4B & L_6 = 4 & u_6 = B \\
K_7 = 5B & L_7 = 3 & u_7 = B \\
K_8 = 6B & L_8 = 7 & u_8 = B \\
K_9 = 6B & L_9 = 7 & u_9 = v \\
K_{10} = 6B & L_{10} = 7 & u_{10} = v.
\end{array} \tag{8.4} $$

Our conclusions are, for this numerical example, that an optimal policy leads
to a profit of 6B + 7v, where v is the stock at the beginning of the 10-stage
process and B the warehouse capacity, and that the optimal policy requires no
action during the first two periods, sell v and buy B during period 3, remain full
during the fourth and fifth periods, sell B and buy B during period 6, sell B
during period 7, buy B during the 8th period, sell out during period 9.
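The closed-form result 6B + 7v can be cross-checked by brute force. The sketch below evaluates the functional equation (4.3) on an integer grid of stock levels; this is exact here because the maximizing u is always 0, v, or B. The function names and the choice B = 5 are illustrative assumptions, and the values $c_7 = 3$ and $c_6 = 4$ are inferred from the tabulated coefficients in (8.4), since the printed figures are hard to read in this copy.

```python
def phi(u, v, p, c):
    """One-period return for moving stock from v to u, eqs. (4.5)-(4.6)."""
    if u <= v:
        return max(p * (v - u), p * v - c * u)
    return max(-c * (u - v), p * v - c * u)


def grid_value(p, c, B):
    """f_N(v) for integer v = 0..B, by direct use of eq. (4.3)."""
    f = [0] * (B + 1)                          # f_0 is identically zero
    for i in range(1, len(p)):                 # i periods remaining
        f = [max(phi(u, v, p[i], c[i]) + f[u] for u in range(B + 1))
             for v in range(B + 1)]
    return f


# Index i = periods remaining; index 0 is unused.
p = [None, 2, 3, 1, 5, 5, 4, 1, 7, 6, 3]
c = [None, 3, 5, 2, 3, 3, 4, 3, 2, 8, 8]
B = 5
f10 = grid_value(p, c, B)
print(all(f10[v] == 6 * B + 7 * v for v in range(B + 1)))
```

The enumeration costs on the order of $N B^2$ evaluations, against N steps for the recurrence, which is the practical content of the theorem.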

9. Conclusions 

We have established the following results: 

1. The optimal N-stage return is a linear function of initial stock with coefficients
dependent upon the costs and selling prices.

2. The optimal policy at any stage is independent of initial stock at that stage.

3. The optimal policy will always have the following structure: Do nothing
for the first k stages (k may equal 0), and oscillate between a full and empty
warehouse condition for the remainder of the process.

4. The policy and return can be calculated trivially using simple recurrence
relations for the coefficients of the linear return function.
DERIVATION OF A LINEAR DECISION RULE FOR
PRODUCTION AND EMPLOYMENT

CHARLES C. HOLT, FRANCO MODIGLIANI, and JOHN F. MUTH* 

Graduate School of Industrial Administration , Carnegie Institute of Technology 

An application of linear decision rules to production and employment scheduling
was described in the last issue of this journal [2]. The hypothetical performance
of these rules represented a significant improvement over the actual company
performance as measured by independent cost estimates and other managerial
measures of efficiency. The quadratic cost function which was used should
be applicable to production and employment scheduling decisions in many other
situations. Also the general approach of approximating decision criteria with
quadratic functions and obtaining linear decision rules can usefully be extended
to many decision problems.

In the present paper we will demonstrate a) how optimal (i.e., minimum expected
cost) decision rules may be derived for a quadratic cost function involving
inventory, overtime, and employment costs, and b) how the numerical
coefficients of the rules may be computed for any set of cost parameters.

1. The Decision Problem 

The costs to be minimized are represented by the following function of work
force, $W_t$; aggregate production, $P_t$; net inventory, $I_t$; and ordered shipments,
$O_t$ (where the subscript, t, designates the time period):

$$ C_N = \sum_{t=1}^{N} \left[ (C_1 - C_6) W_t + C_2 (W_t - W_{t-1} - C_{11})^2 + C_3 (P_t - C_4 W_t)^2 + C_5 P_t + C_{12} P_t W_t + C_7 (I_t - C_8 - C_9 O_t)^2 + C_{13} \right] \tag{1.1} $$

where, by definition, the excess of production over orders affects net inventory
as follows:

$$ P_t - O_t = I_t - I_{t-1}, \qquad t = 1, 2, \ldots, N.\,{}^{1} \tag{1.2} $$

* We gratefully acknowledge the criticism and help of Messrs. Herbert A. Simon and 
Peter R. Winters. 

Research undertaken for the project, Planning and Control of Industrial Operations,
under contract with the Office of Naval Research. Reproduction in whole or in part is
permitted for any purpose of the United States Government.

1 We have not found it necessary to place bounds on the variables, such as non-negativity
restraints on production, because for the type of problem with which we have been dealing
the unconstrained solution can be expected to satisfy such constraints with but rare exceptions.
Our general approach is to view certain actions, for example negative production and
overcapacity operations, as being undesirable because they are expensive. In minimizing
costs, these actions are automatically avoided so there is little or no need to place bounds
on the solutions.

The existence of such a solution requires the satisfaction of the second order condition 
that the cost function be a positive definite quadratic form. We believe that this condition 
The cost function above is somewhat more general than that presented in
Equation 7 of [2], but the only important change is the recognition of a possible
additional interaction between the size of the work force and the production
level, i.e., $C_{12} P_t W_t$. The term $C_7 (I_t - C_8 - C_9 O_t)^2$ represents inventory carrying
and run-out costs, the minimum of which varies with the rate of incoming
orders. The terms

$$ (C_1 - C_6) W_t + C_3 (P_t - C_4 W_t)^2 + C_5 P_t + C_{12} P_t W_t $$

approximate the costs of regular and overtime hours for specified levels of the
work force and rates of production.² Costs associated with hiring and firing,
i.e., changes in the work force, are represented by the term $C_2 (W_t - W_{t-1} - C_{11})^2$.³
The constant cost term, $C_{13}$, is not changed by the scheduling decisions, and
hence is irrelevant in their making.

The problem we then face is the following: To choose a decision rule (strategy) 
for making production and labor force decisions in successive time periods that 
will minimize the expected value of total costs over a large number of periods. 
Since costs are influenced by the interaction between current actions and future 
orders, forecasts of the future are indispensable even though such forecasts are 
subject to errors. The passage of time makes new information available which 
allows improvements in the accuracy of the forecasts. The design of an optimal 
decision rule should take these considerations into account. 

In general, however, future orders are uncertain; that is to say, information
about orders in each future period may be cast in the form of a probability
distribution. H. A. Simon [6] has proved that the optimal solution for this uncertainty
case can be obtained directly from the solution of the certainty case.⁴
For this purpose we simply replace each period’s probability distribution of
orders with its mathematical expectation (the average of orders weighted by the
probability distribution) and then proceed as though these expected values
were certain. This procedure will yield a decision that is optimal for the first
period. When new information is available at the end of the period, the forecasts
should be revised and the process repeated. “Certainty equivalence” is extremely
important, because it enables us to obtain a simple and tractable solution for

normally will be met by the cost structures encountered in practice since in general costs 

z £zz £«“ i : v 0 £* bI ' <o : * conki "* io “ u, “ ) u 

o < r<°r c &t ri:: T z-T mimm “ m . exu,ts u «d <?,««positive &Q d 

«c7t~m^r: dlt r ar r uffi r nt - but they •» 8tron * w 

duct^n^ro^on^peri^d^ranother^lMa^ 6 fZ*** ^ 

costs of run-outs th ? cora P onen t of cost unchanged. The 

ating for'^oiylenetlf of in ^® n * 0r 7 P(®vent the cumulative of production from devi- 

.r:r y / e “ gtb • me (° m the cumuIa ^e of orders received. 

laying off iTorver 1 ^'^ ^! 10 in the costs of hiring and 

expression is expanded, we find that only^hTterm 0 ! ioci ’ lionfl ’ If thi f 0<M,t 

play between the eost n j wm, Zb n {W* — W^t), represent# m inter- 

»«y^iod.,«h. raptM ,i 4 

«. ° b, “” d ky H. Th.U in for tb. m. to whioh InM 
the general uncertainty decision problem. It should be noted, however, that the
proof of certainty equivalence depends critically on the decision criterion function
being a quadratic form. This is one reason for interest in quadratic criteria.

Because of the certainty equivalence property we can now re-state our problem
as the following simpler one: To minimize (1.1), subject to the relations (1.2),
with respect to the decision variables ($W_1, W_2, \ldots$) and ($P_1, P_2, \ldots$) for any
given initial conditions (inventory and work force) and an arbitrary “known”
pattern of future orders.

The reader who is not interested in the derivation of the decision rules might
proceed to Section 4, which presents a self-contained step by step computational
procedure.


2. Derivation of the Conditions for Minimum Cost 

The first order conditions for minimum cost where future orders are given
may be obtained by equating to zero the partial derivatives of cost, $C_N$, with
respect to each independent decision variable. In stating these first order conditions
it is convenient to make use of the following notation, which is illustrated
with the variable, W, for expressing the differences between the magnitude of a
variable in successive time periods:

$$ \begin{aligned}
\Delta W_t &= W_{t+1} - W_t \\
\Delta^2 W_t &= \Delta W_{t+1} - \Delta W_t = W_{t+2} - 2W_{t+1} + W_t \\
\Delta^3 W_t &= W_{t+3} - 3W_{t+2} + 3W_{t+1} - W_t \\
\Delta^4 W_t &= W_{t+4} - 4W_{t+3} + 6W_{t+2} - 4W_{t+1} + W_t
\end{aligned} \tag{2.1} $$

Differentiating $C_N$, (1.1), with respect to $W_r$ ($r = 1, 2, \ldots, N - 1$), and
noting that

$$ \frac{\partial W_{t-1}}{\partial W_r} = \begin{cases} 1 & \text{if } t = r + 1 \\ 0 & \text{otherwise} \end{cases} \tag{2.2} $$

and

$$ \frac{\partial W_t}{\partial W_r} = \begin{cases} 1 & \text{if } t = r \\ 0 & \text{otherwise,} \end{cases} \tag{2.3} $$

we obtain:

$$ \frac{\partial C_N}{\partial W_r} = C_1 - C_6 + 2C_2(\Delta W_{r-1} - C_{11}) - 2C_2(\Delta W_r - C_{11}) - 2C_3 C_4 (P_r - C_4 W_r) + C_{12} P_r = 0, \qquad r = 1, 2, \ldots, N - 1. \tag{2.4} $$

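Solving (2.4) for $P_r$ we obtain:

$$ \begin{aligned}
P_r &= \frac{C_{10}}{C_{14}} - C_{15} \Delta^2 W_{r-1} + C_{16} W_r \\
    &= \frac{C_{10}}{C_{14}} - C_{15} W_{r+1} + C_{23} W_r - C_{15} W_{r-1}, \qquad r = 1, 2, \ldots, N - 1
\end{aligned} \tag{2.5} $$

where we have defined

$$ C_{10} = C_1 - C_6, \qquad C_{14} = 2C_3 C_4 - C_{12}, \qquad C_{15} = \frac{2C_2}{C_{14}}, \qquad C_{16} = \frac{2C_3 C_4^2}{C_{14}}, \qquad C_{23} = C_{16} + 2C_{15}. $$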

Thus we find that the production rate of each period is a linear function of the 
size of the work force in the same and adjacent periods. If we knew the work 
force decisions, we could readily determine the production decisions. 
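As an illustration of (2.5), the sketch below evaluates the production rate from three successive work-force levels in both of its equivalent forms. The function names are assumptions, the coefficient values in the usage note are those reported for Example 1 later in the paper, and the work-force figures are purely hypothetical.

```python
def production_rate(w_prev, w_now, w_next, c10_over_c14, c15, c16):
    """Eq. (2.5), first form: uses the second difference of the work force."""
    second_difference = w_next - 2.0 * w_now + w_prev   # Delta^2 W_{r-1}
    return c10_over_c14 - c15 * second_difference + c16 * w_now


def production_rate_expanded(w_prev, w_now, w_next, c10_over_c14, c15, c16):
    """Eq. (2.5), second form: uses C_23 = C_16 + 2 * C_15."""
    c23 = c16 + 2.0 * c15
    return c10_over_c14 - c15 * w_next + c23 * w_now - c15 * w_prev
```

With $C_{10}/C_{14} = 26.014109$, $C_{15} = 56.701940$, $C_{16} = 5.67$ and hypothetical work forces of 80, 81 and 83 in periods r - 1, r and r + 1, both forms give a production rate of about 428.58.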

Since the inventory holding and runout costs depend on the inventory level
which, in turn, depends on the cumulative production of all previous periods,
if we take the partial derivatives of total cost, $C_N$, with respect to production
rates as the second set of decision variables, we obtain a very complicated expression.
This may be avoided by considering inventory as the second decision
variable instead of production. The production rate for each period would then
be uniquely determined through (1.2). Therefore we differentiate the cost function
with respect to the inventory in each period and equate to zero in order to
complete the first order conditions for minimum cost. Using the production-inventory
relation (1.2) we note that:


$$ \frac{\partial P_t}{\partial I_r} = \frac{\partial}{\partial I_r} (O_t + I_t - I_{t-1}) = \frac{\partial I_t}{\partial I_r} - \frac{\partial I_{t-1}}{\partial I_r} = \begin{cases} 1 & \text{if } t = r \\ -1 & \text{if } t = r + 1 \\ 0 & \text{otherwise.} \end{cases} \tag{2.6} $$

Hence differentiating $C_N$, (1.1), with respect to $I_r$ ($r = 1, 2, \ldots, N - 1$) and
setting the derivatives equal to zero we obtain

$$ \begin{aligned}
\frac{\partial C_N}{\partial I_r} = {} & 2C_3 (P_r - C_4 W_r) - 2C_3 (P_{r+1} - C_4 W_{r+1}) + C_5 - C_5 + C_{12} W_r \\
& - C_{12} W_{r+1} + 2C_7 (I_r - C_8 - C_9 O_r) = 0, \qquad r = 1, 2, \ldots, N - 1.
\end{aligned} \tag{2.7} $$

Solving for inventory we obtain

$$ I_r = \frac{C_3}{C_7} \Delta P_r - \frac{C_{14}}{2C_7} \Delta W_r + C_8 + C_9 O_r, \qquad r = 1, 2, \ldots, N - 1. \tag{2.8} $$

We now use this equation to substitute for $I_r$ ($r = 1, 2, \ldots, N - 1$) in the
equations (1.2) and thus eliminate the inventory variable. By this substitution
we obtain equations in the unknowns, production and employment, as shown
in (2.9) below. It will be noted that the first period (r = 1) must be treated
differently from the others for $I_0$, the initial inventory, is not an unknown
decision variable, but a known initial condition.

$$ \begin{aligned}
P_1 - O_1 &= I_1 - I_0 = \frac{C_3}{C_7} \Delta P_1 - \frac{C_{14}}{2C_7} \Delta W_1 + C_8 + C_9 O_1 - I_0 \\
P_r - O_r &= \Delta I_{r-1} = \frac{C_3}{C_7} \Delta^2 P_{r-1} - \frac{C_{14}}{2C_7} \Delta^2 W_{r-1} + C_9 \Delta O_{r-1}, \qquad r = 2, 3, \ldots, N - 1
\end{aligned} \tag{2.9} $$

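Now using the relation between production and size of work force that has
been derived in (2.5), we can eliminate production from the above equations,
obtaining:

$$ \begin{aligned}
\frac{C_{10}}{C_{14}} - C_{15} \Delta^2 W_0 + C_{16} W_1 - O_1 &= -C_{17} \Delta^3 W_0 + C_{18} \Delta W_1 + C_8 + C_9 O_1 - I_0 \\
\frac{C_{10}}{C_{14}} - C_{15} \Delta^2 W_{r-1} + C_{16} W_r - O_r &= -C_{17} \Delta^4 W_{r-2} + C_{18} \Delta^2 W_{r-1} + C_9 \Delta O_{r-1}, \qquad r = 2, 3, \ldots, N - 1
\end{aligned} \tag{2.10} $$

where we define

$$ C_{17} = \frac{C_3 C_{15}}{C_7}, \qquad C_{18} = \frac{C_3 C_{16}}{C_7} - \frac{C_{14}}{2C_7}. $$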

Equations (2.10) constitute a set of simultaneous linear relations in the unknown
employment levels for the various periods. Expanding the differences by
using (2.1) and collecting the unknowns on the left, we can rewrite this system
of equations as follows:

$$ \begin{aligned}
C_{19} W_1 - C_{20} W_2 + C_{17} W_3 &= (1 + C_9) O_1 + (C_{15} + C_{17}) W_0 + C_8 - \frac{C_{10}}{C_{14}} - I_0 \\
-C_{21} W_1 + C_{22} W_2 - C_{21} W_3 + C_{17} W_4 &= -C_9 O_1 + (1 + C_9) O_2 - C_{17} W_0 - \frac{C_{10}}{C_{14}} \\
C_{17} W_{r-2} - C_{21} W_{r-1} + C_{22} W_r - C_{21} W_{r+1} + C_{17} W_{r+2} &= -C_9 O_{r-1} + (1 + C_9) O_r - \frac{C_{10}}{C_{14}}, \qquad r = 3, 4, \ldots, N - 1
\end{aligned} \tag{2.11} $$

where we define

$$ \begin{aligned}
C_{19} &= C_{16} + C_{18} + 2C_{15} + 3C_{17} \\
C_{20} &= C_{15} + 3C_{17} + C_{18} \\
C_{21} &= C_{15} + 4C_{17} + C_{18} \\
C_{22} &= C_{16} + 2C_{18} + 2C_{15} + 6C_{17}
\end{aligned} $$


This system has two more unknown variables, N + 1, than equations, N - 1;
this deficiency could be remedied by supplying terminal conditions and writing
two more equations. Rather than do this, however, we let N approach infinity
so that the terminal conditions have a negligible influence on the employment
(and production) of the first few periods.

The structure of this infinite set of linear simultaneous equations is most 
easily seen when written in the matrix form: 


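$$ \begin{bmatrix}
C_{19} & -C_{20} & C_{17} & & & \\
-C_{21} & C_{22} & -C_{21} & C_{17} & & \\
C_{17} & -C_{21} & C_{22} & -C_{21} & C_{17} & \\
 & C_{17} & -C_{21} & C_{22} & -C_{21} & C_{17} \\
 & & \ddots & \ddots & \ddots & \ddots
\end{bmatrix}
\begin{bmatrix} W_1 \\ W_2 \\ W_3 \\ W_4 \\ \vdots \end{bmatrix}
=
\begin{bmatrix}
(1 + C_9) O_1 + (C_{15} + C_{17}) W_0 + C_8 - \dfrac{C_{10}}{C_{14}} - I_0 \\
-C_9 O_1 + (1 + C_9) O_2 - C_{17} W_0 - \dfrac{C_{10}}{C_{14}} \\
-C_9 O_2 + (1 + C_9) O_3 - \dfrac{C_{10}}{C_{14}} \\
-C_9 O_3 + (1 + C_9) O_4 - \dfrac{C_{10}}{C_{14}} \\
\vdots
\end{bmatrix} \tag{2.12} $$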

SET 01 e<1Uatiom be ** ™known JP, in which W8 are 
We can now summarise the results of this section. For the quadratic cost
function in the decision variables, $W_r$ and $P_r$ ($r = 1, 2, \ldots$) and “known” future
orders, we can obtain from the first order conditions for minimum costs a solution
in which a) the work force decisions are functions of future orders and initial
conditions, i.e., the solution of equations (2.11), and b) the production level
decisions are functions of the work force decisions using equations (2.5).

Because the original cost function is quadratic, when we differentiate to 
obtain the first order conditions for minimum cost, the functions which are 
obtained are linear. The relative ease with which such linear equation systems 
may be solved constitutes an important reason for interest in quadratic decision 
criteria. 

In the next section we consider means of obtaining for the first period a solution 
of the above conditions for minimum cost. 

3. Solution of the Recurrence Relations 

A number of techniques are available for the solution of (2.11), even though
it is an infinite system in the unknowns $W_1$, $W_2$, et cetera. We shall employ
here the one that appears to be the most simple and direct.⁵ A solution for all
the P’s and W’s is not required since actions will be taken on only the first few
steps of the plan. We are primarily interested in solving the set of equations for
the immediate actions, $P_1$ and $W_1$. Expressions for determining their values
will then constitute the desired decision rules.

From the system (2.11) we may obtain a single equation by multiplying each
equation by the expression $\lambda^{r-1}$, where $\lambda$ is a certain variable (which may take
on complex values) and r is the number of the equation. Thus the first equation
is multiplied by unity ($\lambda^0$), the second is multiplied by $\lambda$, the third by $\lambda^2$, and
so on. Adding the resulting system of equations, we obtain:

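$$ \begin{aligned}
& (C_{19} W_1 - C_{20} W_2 + C_{17} W_3) + \lambda (-C_{21} W_1 + C_{22} W_2 - C_{21} W_3 + C_{17} W_4) \\
& \quad + \sum_{r=3}^{\infty} \lambda^{r-1} (C_{17} W_{r-2} - C_{21} W_{r-1} + C_{22} W_r - C_{21} W_{r+1} + C_{17} W_{r+2}) \\
& = (1 + C_9) O_1 + \sum_{r=2}^{\infty} \lambda^{r-1} \left[ -C_9 O_{r-1} + (1 + C_9) O_r \right] + C_8 - I_0 \\
& \quad + (C_{15} + C_{17}) W_0 - \lambda C_{17} W_0 - \frac{C_{10}}{C_{14}} \sum_{r=1}^{\infty} \lambda^{r-1}
\end{aligned} \tag{3.1} $$

By rearranging terms and noting that

$$ \sum_{r=1}^{\infty} \lambda^{r-1} = \frac{1}{1 - \lambda}, $$

we have: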

5 Another method, which involves the inversion of the matrix of coefficients in equation
(2.12), was devised by Modigliani [3]. His method allows one to find all the entries of the
inverse matrix, and is directly applicable if the coefficients are symmetric about the main
diagonal, except for the “corners.” The method employed here was developed in a somewhat
more general form by Muth [4].
$$ \begin{aligned}
& (C_{17} \lambda^{-2} - C_{21} \lambda^{-1} + C_{22} - C_{21} \lambda + C_{17} \lambda^2) \sum_{r=1}^{\infty} \lambda^{r-1} W_r \\
& \quad + \left[ (C_{19} - C_{22}) + C_{21} \lambda^{-1} - C_{17} \lambda^{-2} \right] W_1 - \left[ (C_{20} - C_{21}) + C_{17} \lambda^{-1} \right] W_2 \\
& = \sum_{r=1}^{\infty} \lambda^{r-1} \left[ 1 + C_9 (1 - \lambda) \right] O_r + \left[ C_{15} + C_{17} (1 - \lambda) \right] W_0 - I_0 + C_8 - \frac{C_{10}}{C_{14} (1 - \lambda)}
\end{aligned} \tag{3.2} $$

This equation holds for all values of $\lambda$ for which its components converge. If
the $W_r$ and $O_r$ are all bounded, it is sufficient for convergence that $\lambda$ lie inside
the unit circle of the complex plane, excluding the origin. That is,

$$ 0 < |\lambda| < 1. \tag{3.3} $$

In particular, we will choose values of $\lambda$ satisfying (3.3) such that the first
term of (3.2) vanishes. Since the series, $\sum_{r=1}^{\infty} \lambda^{r-1} W_r$, converges, the first term
will vanish if the polynomial coefficient is equal to zero, i.e. if

$$ C_{17} \lambda^{-2} - C_{21} \lambda^{-1} + C_{22} - C_{21} \lambda + C_{17} \lambda^2 = 0. \tag{3.4} $$

If $\lambda_1$ satisfies (3.4), as the result of symmetry $1/\lambda_1$ does also. Hence, if we
can find any solution, not zero or unity, we can find a solution that satisfies
the restriction (3.3). We shall show later that there are two and only two values
of $\lambda$ that satisfy the restrictions (3.3) as well as the auxiliary
equation (3.4).

Inasmuch as $\lambda_1$ and $\lambda_2$ are roots of the auxiliary equation, we have

$$ C_{19} - C_{22} + C_{21} \lambda_i^{-1} - C_{17} \lambda_i^{-2} = C_{19} - C_{21} \lambda_i + C_{17} \lambda_i^2, \qquad i = 1, 2. \tag{3.5} $$

Substituting each of these roots into (3.2), and using the relation (3.5), we obtain
the following two equations in the two unknowns $W_1$ and $W_2$:

$$ \begin{aligned}
& (C_{19} - C_{21} \lambda_i + C_{17} \lambda_i^2) W_1 - \left[ (C_{20} - C_{21}) + C_{17} \lambda_i^{-1} \right] W_2 \\
& = \left[ 1 + C_9 (1 - \lambda_i) \right] \left( \sum_{r=1}^{\infty} \lambda_i^{r-1} O_r \right) + \left[ C_{15} + C_{17} (1 - \lambda_i) \right] W_0 - I_0 \\
& \quad + C_8 - \frac{C_{10}}{C_{14}} \cdot \frac{1}{1 - \lambda_i}, \qquad i = 1, 2.
\end{aligned} \tag{3.6} $$

Using any of the numerous methods available for solving such small systems of
equations, we can then obtain the decision rules for $W_1$ and $W_2$. One
method is illustrated in Section 4.

Once we have determined $W_1$ and $W_2$ from (3.6) we can use (2.5) to determine the
optimal rate of production, $P_1$. Planned levels of the labor force and rates of
production for periods further into the future (i.e., $W_2, \ldots$ and $P_2, \ldots$) can
probably be calculated most efficiently by successive application of the above
decision rules for $W_1$ and $P_1$ together with the inventory-production relation (1.2).
The Roots of the Auxiliary Equation. We will examine next the problem of
finding the roots to the auxiliary equation (3.4) that also satisfy the conditions
(3.3). Because of the symmetry of the coefficients of this equation, the problem of
finding the roots may be broken down into that of determining the roots of two
quadratic equations. We first make a change in variables; let:

$$ s = \lambda - 2 + 1/\lambda = (1 - \lambda)^2 / \lambda. \tag{3.7} $$

Then (3.4) may be reduced to

$$ C_{17} s^2 - (C_{15} + C_{18}) s + C_{16} = 0 \tag{3.8} $$

since from (2.11), $C_{22} = C_{16} + 2(C_{15} + C_{18}) + 6C_{17}$ and $C_{21} = (C_{15} + C_{18}) + 4C_{17}$.
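As a worked check of this reduction, using only (3.7) and the definitions of $C_{21}$ and $C_{22}$, note that

$$ \lambda + \lambda^{-1} = s + 2, \qquad \lambda^2 + \lambda^{-2} = (s + 2)^2 - 2 = s^2 + 4s + 2, $$

so that the left-hand side of (3.4) becomes

$$ C_{17} (s^2 + 4s + 2) - C_{21} (s + 2) + C_{22} = C_{17} s^2 + (4C_{17} - C_{21}) s + (2C_{17} - 2C_{21} + C_{22}) = C_{17} s^2 - (C_{15} + C_{18}) s + C_{16}. $$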
The roots of (3.8) are

$$ s_j = \frac{1}{2C_{17}} \left[ (C_{15} + C_{18}) \pm \sqrt{(C_{15} + C_{18})^2 - 4 C_{16} C_{17}} \right] \tag{3.9} $$

for j = 1 and 2, respectively.

Secondly, we have the quadratic equations for $\lambda$ from (3.7):

$$ \lambda^2 - (2 + s_j) \lambda + 1 = 0, \qquad j = 1, 2 \tag{3.10} $$

whose roots are

$$ \lambda_i = \tfrac{1}{2} \left[ (2 + s_j) - \sqrt{s_j (4 + s_j)} \right], \qquad i = j = 1, 2 \tag{3.11a} $$

$$ \lambda_i = \tfrac{1}{2} \left[ (2 + s_j) + \sqrt{s_j (4 + s_j)} \right], \qquad i = 2 + j = 3, 4 \tag{3.11b} $$

If the roots $s_j$ are complex, we can write the radical $\sqrt{s_j (4 + s_j)}$ directly in a
form that involves real coefficients. Let x and y be the real and imaginary parts,
respectively, of $s_j (4 + s_j)$ and let $r = \sqrt{x^2 + y^2}$. It is well-known (see, for
example, [1]) that

$$ \sqrt{s_j (4 + s_j)} = \frac{1}{\sqrt{2}} \left[ \sqrt{r + x} \pm i \sqrt{r - x} \right]. \tag{3.12} $$
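Formula (3.12) can be sanity-checked against a library square root. In the sketch below the function name is an assumption, and the sign of the imaginary part is taken from the sign of y, which is an assumed convention chosen to match the principal branch returned by Python's `cmath.sqrt`.

```python
import cmath
import math


def sqrt_via_real_parts(z):
    """Square root of a complex number using only real square roots, eq. (3.12)."""
    x, y = z.real, z.imag
    r = math.hypot(x, y)                          # r = sqrt(x^2 + y^2)
    real = math.sqrt(max(0.0, r + x) / 2.0)       # (1/sqrt 2) * sqrt(r + x)
    imag = math.sqrt(max(0.0, r - x) / 2.0)       # (1/sqrt 2) * sqrt(r - x)
    # Assumed sign convention: follow the sign of y (principal branch).
    return complex(real, imag if y >= 0 else -imag)


print(sqrt_via_real_parts(3 + 4j), cmath.sqrt(3 + 4j))
```

For z = 3 + 4i we have r = 5, so the real part is $\sqrt{8/2} = 2$ and the imaginary part is $\sqrt{2/2} = 1$.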

We will now list two important properties of the roots to the auxiliary equation
(3.4). First, the four roots are either all real or all complex.⁷ Second, exactly
two of these roots ($\lambda_1$ and $\lambda_2$) have moduli less than one, while the moduli of
the other two ($\lambda_3$ and $\lambda_4$) exceed one⁸ since the parameters, $C_{16}$, $C_{17}$ and ($C_{15} + C_{18}$)
all have the same sign, which in turn follows from the conditions of footnote 1.
Furthermore, the roots are distinct except for the “hairline” case $(C_{15} + C_{18})^2 =$

it is computationally unstable (i.e., round-off errors eventually grow without bound).
Techniques to impose computational stability increase the number of operations, and require
some degree of mathematical sophistication. One such technique [3] permits the computation
of any desired entries of the inverse of the matrix in (2.12).

7 This follows immediately from equations (3.9) and (3.11). If either $s_j$ is real (and hence
positive), so will be the other; therefore the radicals $\sqrt{s_j (4 + s_j)}$ will both be real. Therefore,
if either $s_j$ is real, so will all the $\lambda_i$. A similar argument holds if either $s_j$ is complex,
and this exhausts the possibilities.
$4 C_{16} C_{17}$, which can always be avoided simply by carrying estimates of these
cost coefficients to more significant figures.

Consequently we know that there are always two (and only two) roots which
satisfy the auxiliary equation as well as the condition $0 < |\lambda| < 1$. Furthermore,
these “relevant” roots will always be $\lambda_1$ and $\lambda_2$, given by equation (3.11a).

The Reduced System of Equations. Having the two allowable roots of the
auxiliary equation, we are in a position to solve the reduced system (3.6) for
the optimal level of the labor force, $W_1$, given a forecast of orders and the initial
conditions of the system. The procedure outlined above is quite straightforward
if the roots of the auxiliary equation are real, since a unique solution exists.⁹
If, on the other hand, the roots are complex, the previous results can still be cast
into a simple computational form. Under these conditions, the second equation
of (3.6) is the complex conjugate of the first. Since an equality implies that the
real and imaginary parts of the equation must each hold independently of the
other, either one of the equations would yield the same system of two linear
equations having only real coefficients.


4. Computational Procedure for Obtaining the Decision Rules 

We shall now illustrate how the method outlined above may be applied to
actual computations. We will take as the first illustration the specific cost function
discussed in [2]; in this application the roots of the auxiliary equation (3.4)
turn out to be real numbers. Another cost structure, the roots of whose resulting
auxiliary equation are complex, will then be briefly examined as a second example.


8 It can be readily verified from equations (3.11) that $\lambda_1 = 1/\lambda_3$ and $\lambda_2 = 1/\lambda_4$. To show
that the assertion holds, we need only show that no roots have a modulus equal to unity.
First, if the $s_j$ are real, we know that $\sqrt{s_j (4 + s_j)} > s_j > 0$. It immediately follows
that $0 < \lambda_i < 1$ (i = 1, 2) and $\lambda_i > 1$ (i = 3, 4). Second, if the roots $s_j$ are complex (conjugates),
assume that the modulus of some $\lambda_i$ (and hence all) is equal to unity. Write the roots
$\lambda_i$ in trigonometric form as $\cos \phi \pm i \sin \phi$; then $s_j = \lambda_i + 1/\lambda_i - 2 = 2(\cos \phi - 1)$, a real
non-positive quantity. But this is a contradiction. Since $C_{16}$, $C_{17}$ and ($C_{15} + C_{18}$) all possess
the same sign, the $s_j$ have positive real parts. Therefore, none of the roots has a modulus
equal to unity. Because the $s_j$ are non-zero, the roots $\lambda_1$ and $\lambda_2$ are distinct unless $s_1 = s_2$,
which situation is possible only for the “hairline” case $(C_{15} + C_{18})^2 = 4 C_{16} C_{17}$.

9 The necessary and sufficient condition for a unique solution is that the determinant of
coefficients does not vanish. The value of this determinant is

$$ \frac{C_{17} (\lambda_1 - \lambda_2)}{(\lambda_1 \lambda_2)^2} \left[ C_{17} (\lambda_1 + \lambda_2 - 1) - (C_{19} - C_{22} + C_{21}) \lambda_1 \lambda_2 \right]. $$

Since the roots have a modulus less than one, we know

$$ (1 - \lambda_1)(1 - \lambda_2) > 0. $$

It follows that

$$ \lambda_1 + \lambda_2 - 1 < \lambda_1 \lambda_2 < (1 + C_7 / C_3) \lambda_1 \lambda_2 = \frac{C_{19} - C_{22} + C_{21}}{C_{17}} \lambda_1 \lambda_2. $$

The determinant then vanishes if and only if $\lambda_1 = \lambda_2$, namely, if $(C_{15} + C_{18})^2 = 4 C_{16} C_{17}$.
EXAMPLE 1. REAL ROOTS

The cost data employed in [2] were the following:

Step 1: List of the Cost Data¹⁰

$$ \begin{array}{ll}
C_1 = 340. & C_7 = .0825 \\
C_2 = 64.3 & C_8 = 320. \\
C_3 = .20 & C_9 = 0. \\
C_4 = 5.67 & C_{11} = 0. \\
C_5 = 51.2 & C_{12} = 0. \\
C_6 = 281. & C_{13} = 0.
\end{array} $$


Next, we evaluate the derived coefficients (which were introduced in Section 2
to simplify the notation):

Step 2: Calculation of the Derived Coefficients

$$ \begin{array}{lll}
C_{14} & = 2 C_3 C_4 - C_{12} & = 2.268000 \\
C_{10}/C_{14} & = (C_1 - C_6)/C_{14} & = 26.014109 \\
C_{15} & = 2 C_2 / C_{14} & = 56.701940 \\
C_{16} & = 2 C_3 C_4^2 / C_{14} & = 5.670000 \\
C_{17} & = C_3 C_{15} / C_7 & = 137.459248 \\
C_{18} & = (2 C_3 C_{16} - C_{14}) / 2 C_7 & = 0. \\
C_{19} & = C_{16} + C_{18} + 2 C_{15} + 3 C_{17} & = 531.451624 \\
C_{20} & = C_{15} + 3 C_{17} + C_{18} & = 469.079684 \\
C_{21} & = C_{15} + 4 C_{17} + C_{18} & = 606.538932 \\
C_{22} & = C_{16} + 2 C_{18} + 2 C_{15} + 6 C_{17} & = 943.829368 \\
C_{23} & = C_{16} + 2 C_{15} & = 119.073880
\end{array} $$

It is desirable to carry these and succeeding calculations to a large number of
decimal places, in spite of inaccuracies in the original cost data, because there
is a tendency for rounding errors to be exaggerated through the subtraction of
numbers that are of the same order of magnitude. Upon completing the calculations
of the decision rules, the extra decimal places that cannot be justified in
terms of the accuracy of the original cost estimates may be dropped.
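Step 2 is mechanical enough to sketch in code. The function name and the dictionary layout below are assumptions; the value $C_7 = .0825$ is inferred from the tabulated $C_{17}$ because the printed entry is truncated in this copy.

```python
def derived_coefficients(C):
    """Compute C_10 and C_14..C_23 from the cost parameters C[1]..C[13]."""
    d = {}
    d[10] = C[1] - C[6]
    d[14] = 2 * C[3] * C[4] - C[12]
    d[15] = 2 * C[2] / d[14]
    d[16] = 2 * C[3] * C[4] ** 2 / d[14]
    d[17] = C[3] * d[15] / C[7]
    d[18] = (2 * C[3] * d[16] - d[14]) / (2 * C[7])
    d[19] = d[16] + d[18] + 2 * d[15] + 3 * d[17]
    d[20] = d[15] + 3 * d[17] + d[18]
    d[21] = d[15] + 4 * d[17] + d[18]
    d[22] = d[16] + 2 * d[18] + 2 * d[15] + 6 * d[17]
    d[23] = d[16] + 2 * d[15]
    return d


# Cost data of Step 1 (C_7 inferred, see the note above).
cost = {1: 340.0, 2: 64.3, 3: 0.20, 4: 5.67, 5: 51.2, 6: 281.0,
        7: 0.0825, 8: 320.0, 9: 0.0, 11: 0.0, 12: 0.0, 13: 0.0}
d = derived_coefficients(cost)
print(round(d[15], 6), round(d[17], 6), round(d[22], 6))
```

Double-precision arithmetic agrees with the tabulated values to about five decimal places; the last printed digit can differ because the original figures were carried by hand.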

Step 3: Calculation of the Roots s. The next step is that of finding the roots of
the auxiliary equation. When the equation is symmetric, as it is here, we have
from (3.9):

$$ s = \frac{1}{2 C_{17}} \left[ (C_{15} + C_{18}) \pm \sqrt{(C_{15} + C_{18})^2 - 4 C_{16} C_{17}} \right], $$

from which we obtain:

$$ s_1 = .242173, \qquad s_2 = .170327. $$

Step 4: Calculation of the Roots X. We can substitute these two values of s 
into equation (3.11a) 

10 Note that these values of the parameter satisfy the conditions of footnote 1 for an 
interior minimum of the cost function. 
$$ \lambda = \tfrac{1}{2} \left[ (2 + s) - \sqrt{s (4 + s)} \right], $$

yielding

$$ \lambda_1 = .614298, \qquad \lambda_2 = .663762. $$

Step 5: Check Substitutions into the Auxiliary Equations. That these roots
satisfy the auxiliary equation (3.4) may be verified by direct substitution. We
have:

$$ C_{17} - C_{21}\lambda_1 + C_{22}\lambda_1^2 - C_{21}\lambda_1^3 + C_{17}\lambda_1^4 = 0 $$

    (137.459248) - (606.538932)(.614298) + (943.829368)(.377362)
        - (606.538932)(.231813) + (137.459248)(.142402) = -.000205

$$ C_{17} - C_{21}\lambda_2 + C_{22}\lambda_2^2 - C_{21}\lambda_2^3 + C_{17}\lambda_2^4 = 0 $$

    (137.459248) - (606.538932)(.663762) + (943.829368)(.440580)
        - (606.538932)(.292440) + (137.459248)(.194111) = .000205

Since .000205 is close to zero (and within the range of rounding errors) we can
safely proceed to the next step.
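
Steps 3 through 5 can be sketched in a few lines. The sketch below assumes that (3.9) has the quadratic-formula shape implied by the tabulated roots, with C_18 = 0; the function names are illustrative only and do not come from the text.

```python
import math

# Derived coefficients from Step 2 (rounded values as tabulated).
C15, C16, C17, C18 = 56.701940, 5.670000, 137.459248, 0.0
C21, C22 = 606.538932, 943.829368

def roots_s(c15, c16, c17, c18):
    """Real roots of C17*s^2 - (C15 + C18)*s + C16 = 0, larger root first."""
    b = c15 + c18
    disc = math.sqrt(b * b - 4 * c16 * c17)
    return (b + disc) / (2 * c17), (b - disc) / (2 * c17)

def root_lambda(s):
    """Equation (3.11a): the root lambda associated with a root s."""
    return 0.5 * ((2 + s) - math.sqrt(s * (4 + s)))

def auxiliary_residual(lam):
    """Left-hand side of the symmetric quartic checked in Step 5."""
    return C17 - C21 * lam + C22 * lam**2 - C21 * lam**3 + C17 * lam**4

s1, s2 = roots_s(C15, C16, C17, C18)
lam1, lam2 = root_lambda(s1), root_lambda(s2)
print(f"s1 = {s1:.6f}, s2 = {s2:.6f}")
print(f"lambda1 = {lam1:.6f}, lambda2 = {lam2:.6f}")
print(f"residuals: {auxiliary_residual(lam1):.2e}, {auxiliary_residual(lam2):.2e}")
```

Working in double precision, the residuals come out far smaller than the .000205 of the hand calculation, which is consistent with the remark that the hand figure reflects rounding only.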

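Step 6: The Reduced System of Equations. We will next substitute the numerical
values of $\lambda$ determined in Step 4 into equations (3.6), which are:

$$ (C_{19} - C_{21}\lambda_i + C_{17}\lambda_i^2) W_1 + C_{17}(1 - \lambda_i^{-1}) W_2
   = [1 + C_9(1 - \lambda_i)] \left[ \sum_{r=1}^{\infty} \lambda_i^{r-1} O_r \right]
   + [C_{15} + C_{17}(1 - \lambda_i)] W_0 - I_0
   + \left[ C_8 - \frac{C_{10}}{C_{14}(1 - \lambda_i)} \right], \qquad i = 1, 2. $$

Performing the indicated arithmetic, we obtain the following equations in the
two unknowns, $W_1$ and $W_2$; the variables on the right-hand side of the equations,
$O_r$ (r = 1, 2, 3, ...), $W_0$, and $I_0$, are known.

$$ 210.727868 W_1 - 86.307088 W_2 = \sum_{r=1}^{\infty} \lambda_1^{r-1} O_r + 109.720247 W_0 - I_0 + 252.553865 $$

$$ 189.415925 W_1 - 69.631907 W_2 = \sum_{r=1}^{\infty} \lambda_2^{r-1} O_r + 102.920963 W_0 - I_0 + 242.631859 $$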

Step 7: The Solution of the Equations for $W_1$. Several methods are available
for solving this relatively simple system of equations. One convenient method is
to eliminate the variable $W_2$ from the system as follows. First multiply the first
equation of Step 6 by the factor -(69.631907/86.307088) = -.806792; then
add the second equation to this new one. Performing these operations and
dividing by the resulting coefficient of $W_1$, we obtain:

$$ W_1 = \sum_{r=1}^{\infty} [-.041582\lambda_1^{r-1} + .051540\lambda_2^{r-1}] O_r + .742153 W_0 - .009958 I_0 + 2.003536 $$

This is the employment decision rule, given by Equation (10) of [2].

Step 8: The Solution of the Equations for $W_2$. The value of $W_1$, from Step 7,
can now be substituted into the first equation of Step 6 to obtain $W_2$.

$$ W_2 = -.011587 \left( \sum_{r=1}^{\infty} \lambda_1^{r-1} O_r \right) - 1.271329 W_0 + .011587 I_0 - 2.926342 + 2.441704 W_1 $$

$$ \phantom{W_2} = \sum_{r=1}^{\infty} [-.113118\lambda_1^{r-1} + .125845\lambda_2^{r-1}] O_r + .540789 W_0 - .012727 I_0 + 1.965700 $$
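
The elimination of Steps 7 and 8 is an ordinary two-by-two linear solve in which each right-hand side is a linear form rather than a number. The sketch below is one way to mechanize it, using Cramer's rule instead of the elimination described in the text; the tuple layout (coefficient of the $\lambda_1$ forecast sum, of the $\lambda_2$ forecast sum, of $W_0$, of $I_0$, and the constant) is an assumption made for illustration.

```python
# Left-hand coefficients of the two equations of Step 6.
a11, a12 = 210.727868, -86.307088
a21, a22 = 189.415925, -69.631907

# Right-hand sides as linear forms: (S1, S2, W0, I0, constant), where S_i
# stands for the forecast sum over lambda_i^(r-1) * O_r.
rhs1 = (1.0, 0.0, 109.720247, -1.0, 252.553865)
rhs2 = (0.0, 1.0, 102.920963, -1.0, 242.631859)

det = a11 * a22 - a12 * a21

# Cramer's rule applied component by component to the linear forms.
W1 = tuple((a22 * p - a12 * q) / det for p, q in zip(rhs1, rhs2))
W2 = tuple((a11 * q - a21 * p) / det for p, q in zip(rhs1, rhs2))

labels = ("S1", "S2", "W0", "I0", "const")
print("W1:", {k: round(v, 6) for k, v in zip(labels, W1)})
print("W2:", {k: round(v, 6) for k, v in zip(labels, W2)})
```

The printed coefficients should agree with the two work force rules to roughly four decimal places; the remaining discrepancies are of the size produced by the rounding in the hand calculation.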

Step 9: Check Substitutions into the Equations. Again, it is advisable to check
the work. Substituting the expressions for $W_1$ and $W_2$ into the left-hand side of
the first equation of Step 6, we obtain the following expression, which may then
be compared for equality with the right-hand side of the equation.

$$ 210.727868 \left\{ \sum_{r=1}^{\infty} [-.041582\lambda_1^{r-1} + .051540\lambda_2^{r-1}] O_r + .742153 W_0 - .009958 I_0 + 2.003536 \right\} $$

$$ {} - 86.307088 \left\{ \sum_{r=1}^{\infty} [-.113118\lambda_1^{r-1} + .125845\lambda_2^{r-1}] O_r + .540789 W_0 - .012727 I_0 + 1.965700 \right\} $$

Simplifying, we obtain:

$$ \sum_{r=1}^{\infty} [1.000399\lambda_1^{r-1} - .000401\lambda_2^{r-1}] O_r + 109.718395 W_0 - .999998 I_0 + 252.547027 $$

Proceeding similarly for the second equation of Step 6, we obtain:

$$ \sum_{r=1}^{\infty} [.000329\lambda_1^{r-1} + (\ldots)\lambda_2^{r-1}] O_r + 102.919428 W_0 - .999999 I_0 + 242.626185 $$

Since the coefficients above agree with those of Step 6 (within the range of
expected rounding errors), we can proceed to the next step.

Step 10: Solution for $P_1$. Equation (2.5) relates the optimal rate of production,
$P_1$, to planned levels of the work force. Making the substitutions of the two
work force rules, from Steps 2, 7, and 8, we can express the optimal production
plan directly in terms of the initial conditions and the forecasts of incoming
orders as:

$$ P_1 = C_{10}/C_{14} - C_{15} W_2 + C_{23} W_1 - C_{15} W_0 $$

$$ \phantom{P_1} = 26.014109 - 56.701940 W_2 + 119.073880 W_1 - 56.701940 W_0 $$

$$ \phantom{P_1} = \sum_{r=1}^{\infty} [1.462680\lambda_1^{r-1} - .998588\lambda_2^{r-1}] O_r + 1.005312 W_0 - .464092 I_0 + 153.123911, $$

which is the production decision rule, Equation (11) of [2].

TABLE 1

Worksheet for Calculation of Weights
Step 11 (Real Roots)

Col. 1   Col. 2          Col. 3          Col. 4: Weights for        Col. 5: Weights for
                                         Work Force Rule            Production Rule
  r      lambda_1^(r-1)  lambda_2^(r-1)  -.041582 lambda_1^(r-1)    1.462680 lambda_1^(r-1)
                                         +.051540 lambda_2^(r-1)    -.998588 lambda_2^(r-1)

  1      1.000000        1.000000         .009958                    .464092
  2       .614298         .663762         .008666                    .235696
  3       .377361         .440580         .007016                    .112002
  4       .231812         .292440         .005433                    .047041
  5       .142402         .194111         .004083                    .014452
  6       .087477         .128844         .003004                   -.000711
  7       .053737         .085522         .002174                   -.006801
  8       .033010         .056766         .001553                   -.008401
  9       .020278         .037679         .001099                   -.007964
 10       .012457         .025010         .000772                   -.006754
 11       .007652         .016601         .000538                   -.005386
 12       .004701         .011019         .000373                   -.004127

Total    2.592675        2.974084         .045476                    .822369

Step 11: Calculation of Forecast Weights. The only remaining step is the
calculation of the weights to be applied to forecasts of future orders for the
work force rule (from Step 7) and for the production rule (from Step 10). Since
these weights are linear combinations of successive powers of the roots $\lambda_i$,
they may be computed on a relatively simple worksheet (Table 1).

In the first column of Table 1 we have the index r representing the number of
time periods ahead. In Columns 2 and 3, the successive powers of the roots
$\lambda_1$ and $\lambda_2$, respectively, are computed. The weight of the forecast of orders in
the r-th period for the labor force rule is given in Column 4 as $-.041582\lambda_1^{r-1} +
.051540\lambda_2^{r-1}$, a weighted sum of the respective entries in the previous two
columns (see Step 7). Similarly, the weights for the production rule in Column
5 are $1.462680\lambda_1^{r-1} - .998588\lambda_2^{r-1}$ (see Step 10). 11
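
The worksheet of Table 1 amounts to evaluating two geometric sequences and combining them. The following sketch reproduces Columns 4 and 5 for twelve periods; the helper names are illustrative. It also evaluates the closed-form sums over an infinite horizon, which appear to be what the "Total" row of the table records, since the twelve listed entries in each column add up to slightly different figures.

```python
lam1, lam2 = 0.614298, 0.663762

def weights(a1, a2, periods=12):
    """Weights a1*lam1^(r-1) + a2*lam2^(r-1) for r = 1, ..., periods."""
    return [a1 * lam1 ** (r - 1) + a2 * lam2 ** (r - 1)
            for r in range(1, periods + 1)]

def infinite_total(a1, a2):
    """Sum of the same weights over r = 1, 2, ... (geometric series)."""
    return a1 / (1 - lam1) + a2 / (1 - lam2)

work_force = weights(-0.041582, 0.051540)   # Column 4
production = weights(1.462680, -0.998588)   # Column 5

for r, (w, p) in enumerate(zip(work_force, production), start=1):
    print(f"{r:2d}  {w: .6f}  {p: .6f}")
print("totals:", round(infinite_total(-0.041582, 0.051540), 6),
      round(infinite_total(1.462680, -0.998588), 6))
```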

EXAMPLE 2: COMPLEX ROOTS 

The computation of the decision rules from the cost function is somewhat more
complicated if the roots of the auxiliary equation turn out at Step 3 to be complex
numbers. To illustrate this case, we will change two of the parameters in the
previous example as follows: let $C_2 = 72.3375$ and $C_3 = .2375$, and carry through
the modified computations.

11 The weights given by Columns 4 and 5 of the worksheet are not identical with those
given in [2]. In the previous article a small adjustment in the weights was made in order to
shorten to 12 months the infinite forecast horizon which results from the theory which has
been derived here.

Step 2: Calculation of the Derived Coefficients. The derived parameters are
almost all changed; they now become:

    C_14      =    2.693250        C_19 =  577.030198
    C_10/C_14 =   21.906618        C_20 =  517.642571
    C_15      =   53.717627        C_21 =  672.284219
    C_16      =    5.670000        C_22 = 1040.955142
    C_17      =  154.641648        C_23 =  113.105254
    C_18      =    0.

Step 3: Calculation of the Roots s. We determine the roots of the auxiliary
equation by substituting 12 in (3.9):

$$ s_1 = .173684 + .080618i $$

$$ s_2 = .173684 - .080618i $$

where $i = \sqrt{-1}$.

Upon obtaining the conjugate roots, s, the root with the positive imaginary
part is designated $s_1$ and the root with the negative imaginary part $s_2$. In the
succeeding calculation, the $\lambda$ root which corresponds to $s_1$ is designated $\lambda_1$,
and similarly for $s_2$ and $\lambda_2$. Attention to this notation is necessary to avoid errors
of sign.

Step 4A: Calculation of $\sqrt{s(4 + s)}$. Equation (3.12) is a standard formula
for expressing the square root of a complex number directly in terms of its real
and imaginary parts. Since we need the square root of s(4 + s) we proceed as
follows:

$$ s(4 + s) = (.173684 \pm .080618i)(4.173684 \pm .080618i) = .718403 \pm .350476i \equiv x \pm yi $$

$$ r = \sqrt{x^2 + y^2} = \sqrt{(.718403)^2 + (.350476)^2} = .799335 $$

$$ \sqrt{s(4 + s)} = \tfrac{1}{\sqrt{2}} \left[ \sqrt{r + x} \pm i\sqrt{r - x} \right] $$

$$ \phantom{\sqrt{s(4 + s)}} = .707107\,(1.231966 \pm .284486i) $$

$$ \phantom{\sqrt{s(4 + s)}} = .871132 \pm .201162i $$

12 In the following calculations a knowledge of the routine manipulation of complex
numbers is required. The essential operations are outlined below, but for explanation
consult a textbook on college algebra or trigonometry, reference [5] for example.

Addition: (a + bi) + (c + di) = (a + c) + (b + d)i

Multiplication: (a + bi) × (c + di) = (ac - bd) + (bc + ad)i

Division: 1/(b + ci) = (b - ci)/(b^2 + c^2)

Step 4B: Calculation of the Roots $\lambda$. Now we can substitute these values into
equation (3.11a) to obtain:

$$ \lambda_j = \tfrac{1}{2} \left[ (2 + s_j) - \sqrt{s_j(4 + s_j)} \right], \qquad j = 1, 2 $$

$$ \phantom{\lambda_j} = \tfrac{1}{2} \left[ (2.173684 \pm .080618i) - (.871131 \pm .201162i) \right] $$

$$ \lambda_1 = .651276 + (-.060272)i = a + bi $$

$$ \lambda_2 = .651276 - (-.060272)i = a - bi $$

These roots have the radius

$$ \rho = \sqrt{a^2 + b^2} = \sqrt{(.651276)^2 + (.060272)^2} = .654059 $$

and argument

$$ \phi = \arctan(b/a) = -5.28720^\circ \quad \text{or} \quad -5^\circ\,17.23', $$

so that

$$ \lambda_1 = \rho(\cos\phi + i\sin\phi) = .65406\,(\cos 5.29^\circ - i\sin 5.29^\circ) $$

$$ \lambda_2 = \rho(\cos\phi - i\sin\phi) = .65406\,(\cos 5.29^\circ + i\sin 5.29^\circ). $$

The calculation of the check substitutions, Step 5, is left to the reader.
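
Where complex arithmetic is available in software, Steps 3, 4A and 4B collapse into a few library calls. The sketch below uses Python's standard `cmath` module and the Example 2 coefficients; it assumes the same quadratic form for (3.9) as in the real-root case, and the variable names are illustrative. The principal square root returned by `cmath.sqrt` has a positive real part, which matches the sign convention of Step 4A.

```python
import cmath
import math

# Derived coefficients of Example 2 (rounded values as tabulated).
C15, C16, C17, C18 = 53.717627, 5.670000, 154.641648, 0.0

b = C15 + C18
disc = cmath.sqrt(b * b - 4 * C16 * C17)      # purely imaginary in this example
s1 = (b + disc) / (2 * C17)                    # root with positive imaginary part
s2 = s1.conjugate()

root = cmath.sqrt(s1 * (4 + s1))               # Step 4A for s1
lam1 = 0.5 * ((2 + s1) - root)                 # Step 4B
lam2 = lam1.conjugate()

rho, phi = cmath.polar(lam1)                   # radius and argument (radians)
phi_deg = math.degrees(phi)

print(f"s1 = {s1:.6f}")
print(f"sqrt(s1(4+s1)) = {root:.6f}")
print(f"lambda1 = {lam1:.6f}")
print(f"rho = {rho:.6f}, phi = {phi_deg:.4f} degrees")
```

Because $\lambda_2$ is the conjugate of $\lambda_1$, only one root has to be computed explicitly, which is also why the attention to sign urged in Step 3 matters.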

Step 6A: The Reduced System of Equations. Substituting the values of $\lambda$
given by Step 4 into a modified version of equation (3.6) we obtain the system:

$$ (C_{19} - C_{21}\lambda_i + C_{17}\lambda_i^2) W_1 + C_{17}(1 - \lambda_i^{-1}) W_2
   = [1 + C_9(1 - \lambda_i)] \left\{ \sum_{r=1}^{\infty} \rho^{r-1} [\cos (r-1)\phi \pm i \sin (r-1)\phi] O_r \right\}
   + [C_{15} + C_{17}(1 - \lambda_i)] W_0 - I_0
   + \left[ C_8 - \frac{C_{10}}{C_{14}(1 - \lambda_i)} \right] $$

$$ (204.218764 \pm 28.379463i) W_1 + (-80.786807 \mp 21.787616i) W_2
   = \sum_{r=1}^{\infty} \rho^{r-1} [\cos (r-1)\phi \pm i \sin (r-1)\phi] O_r
   + (107.644881 \pm 9.320561i) W_0 - I_0 + (259.002687 \pm 10.542516i) $$

Step 6B: The Equations Involving Only Real Coefficients. Since the real and
imaginary parts of the above equations must each be equal, we can equate these
two parts separately in order to obtain the following system which involves only
real coefficients:

$$ 204.218764 W_1 - 80.786807 W_2 = \sum_{r=1}^{\infty} [\rho^{r-1} \cos (r-1)\phi] O_r + 107.644881 W_0 - I_0 + 259.002687 $$

$$ 28.379463 W_1 - 21.787616 W_2 = \sum_{r=1}^{\infty} [\rho^{r-1} \sin (r-1)\phi] O_r + 9.320561 W_0 + 10.542516 $$

TABLE 2

Worksheet for Calculations of Weights
Step 11 (Complex Roots)

Col. 1  Col. 2        Col. 3        Col. 4     Col. 5: Weights for        Col. 6: Weights for
                                               Work Force Rule            Production Rule
  r     cos (r-1)phi  sin (r-1)phi  rho^(r-1)  [.010102 cos (r-1)phi      [.435773 cos (r-1)phi
                                               -.037457 sin (r-1)phi]     +.849670 sin (r-1)phi]
                                               x rho^(r-1)                x rho^(r-1)

  1     1.00000        .00000       1.000000    .010102                    .435773
  2      .99574       -.09214        .654059    .008837                    .232602
  3      .98307       -.18353        .427793    .007189                    .116554
  4      .96192       -.27332        .279802    .005583                    .052308
  5      .93263       -.36078        .183007    .004197                    .018277
  6      .89800       -.44001        .119697    .003059                    .002090
  7      .85060       -.52582        .078289    .002215                   -.005958
  8      .79854       -.60192        .051206    .001568                   -.008370
  9      .73967       -.67297        .033492    .001095                   -.008355
 10      .67450       -.73828        .021906    .000755                   -.007303
 11      .60361       -.79729        .014328    .000515                   -.005938
 12      .52753       -.84948        .009371    .000348                   -.004609

Total      -             -          2.867573    .046154                    .804475


Step 7: Solution of the Equations for $W_1$. Eliminating $W_2$ from the equations
above, we obtain the work force rule:

$$ W_1 = \sum_{r=1}^{\infty} \rho^{r-1} [.010102 \cos (r-1)\phi - .037457 \sin (r-1)\phi] O_r + .738304 W_0 - .010102 I_0 + 2.221549 $$

The next three steps, 8, 9, and 10, are basically the same as those of the
preceding case. Leaving the detail of these steps to the reader, we report the
production rule that is obtained:

$$ P_1 = \sum_{r=1}^{\infty} \rho^{r-1} [.435773 \cos (r-1)\phi + .849670 \sin (r-1)\phi] O_r + 1.110097 W_0 - .435773 I_0 + 143.729914. $$

Step 11: Calculation of Forecast Weights. The worksheet for calculating these
weights now requires more columns than previously (Table 2). The weights for
the work force rule and the production rule are given, respectively, in Columns
5 and 6, which are computed from the first three columns.

This completes the computation procedure for obtaining the two decision
rules.
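
The Table 2 worksheet can be generated directly from the radius and argument of the complex roots. The sketch below assumes $\rho = .654059$ and $\phi = -5.28720°$ as obtained in Step 4B and uses illustrative helper names; the trigonometric entries it produces may differ from the printed worksheet in the later decimal places.

```python
import math

rho = 0.654059
phi = math.radians(-5.28720)   # argument of lambda_1, in radians

def weight(c_cos, c_sin, r):
    """rho^(r-1) * [c_cos*cos((r-1)*phi) + c_sin*sin((r-1)*phi)]."""
    k = r - 1
    return rho ** k * (c_cos * math.cos(k * phi) + c_sin * math.sin(k * phi))

work_force = [weight(0.010102, -0.037457, r) for r in range(1, 13)]  # Column 5
production = [weight(0.435773, 0.849670, r) for r in range(1, 13)]   # Column 6

for r, (w, p) in enumerate(zip(work_force, production), start=1):
    print(f"{r:2d}  {w: .6f}  {p: .6f}")
```

As in the real-root case, the production weights change sign after the first few periods, while the work force weights decay smoothly toward zero.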
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5. Conclusion 

Once these decision rules have been obtained, computation of the production
and employment schedule for any production period requires but a few minutes.
These rules may continue in use unchanged as long as there is no significant
change in the cost structure. 13

For the quadratic cost function in the decision variables, $W_r$ and $P_r$ (r = 1,
2, ...), and initial conditions and forecasts, we have obtained the first order
conditions for a minimum cost. The solution of these conditions then yielded for
the next period the optimal level of the work force, $W_1$, and the optimal rate of
production, $P_1$, as functions of the forecasts of orders and the levels of inventories
and employment at the beginning of that period. We finally presented a
procedure for computing the solution, and this procedure was illustrated with
the cost parameters developed in the previous article.

The certainty equivalence property enables us to use these solutions, which
were derived for "known" future order receipts, as decision rules for the usual
situation in which forecasts of future order receipts are subject to error. The
action that is indicated by the decision rule for the first period is optimal in the
sense that it is the best action that can be taken on the basis of information
currently available. The tentative plans for future actions based on present
information may also be obtained, but, of course, these plans will undoubtedly
be modified as new information becomes available before they are put into effect.

A forecasting method should be used whose expected error is zero, or more loosely,
whose algebraic average error is zero. Of course, the more accurate the forecasts,
the better the decisions and the lower the costs.

Although the procedures presented here and in [2] were developed for a particular
factory, they should be of immediate usefulness in facilitating production
and employment decisions elsewhere. 14 Moreover, the general technique for
obtaining decision rules from quadratic criterion functions is applicable to many
other decision problems.
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DYNAMIC VERSION OF THE ECONOMIC 
LOT SIZE MODEL*†

HARVEY M. WAGNER and THOMSON M. WHITIN 


Stanford University and Massachusetts Institute of Technology 

A forward algorithm for a solution to the following dynamic version of the
economic lot size model is given: allowing the possibility of demands for a
single item, inventory holding charges, and setup costs to vary over N periods,
we desire a minimum total cost inventory management scheme which satisfies
known demand in every period. Disjoint planning horizons are shown
to be possible which eliminate the necessity of having data for the full N
periods.

1. Introduction 

The "square root formula" [7] (equation 8 below) for an economic lot
size under the assumption of a steady-state demand rate is well known. The
calculation is predicated upon a balancing of the costs of holding inventory
against the costs of placing an order. When the assumption of a steady-state
demand rate is dropped, i.e., when the amounts demanded in each period are
known but are different, and furthermore, when inventory costs vary from
period to period, the square root formula (applied to the overall average demand
and costs) no longer assures a minimum cost solution. We shall present a simple
algorithm for solving the dynamic version of the model. 1

The mathematical model may be viewed as a "one-way temporal feasibility"
problem, in that it is feasible to order inventory in period t for demand in period
t + k but not vice versa. This suggests that the same model also permits an
alternative interpretation as the following "one-way technological feasibility"
problem [1]. Suppose a manufacturer produces an item having N possible values
for a certain critical dimension; for example, the item may be steel beams of
various strengths. He anticipates a known demand schedule for the N types of
steel beams, and it is feasible to substitute a beam of strength $g_1$ for a beam of
strength $g_2$ if and only if $g_1 > g_2$. Producing each kind of beam requires a setup
cost, and using a beam in excess of the required strength incurs a charge in terms
of wasted steel. The operator of the steel mill wishes to know how many beams
of each type to produce in order to minimize total costs.


2. Mathematical Model 

As in the standard lot size formulation, we assume that the buying (or manufacturing)
costs and selling price of the item are constant throughout all time
periods, and consequently only the costs of inventory management are of concern.
In the t-th period, t = 1, 2, ..., N, we let

    d_t = amount demanded
    i_t = interest charge per unit of inventory carried forward to period t + 1
    s_t = ordering (or setup) cost
    x_t = amount ordered (or manufactured). 3

* Received February 1958.

† This report was sponsored in part by the Office of Naval Research.

1 We have discussed a further generalization in which period sales are a function of
price, and costs are not necessarily proportional to output or the amount purchased.

2 We are indebted to Professor W. Sadowski, Central School of Planning and Statistics
Warsaw University, who suggested this application.
We assume that all period demands and costs are non-negative. The problem
is to find a program $x_t \ge 0$, t = 1, 2, ..., N, such that all demands are met
at a minimum total cost; any such program, which need not be unique, will be
termed optimal.

Of course one method of solving the optimization problem is to enumerate
$2^{N-1}$ combinations of either ordering or not ordering in each period (we assume
an order is placed in the first period). 4 A more efficient algorithm evolves from a
dynamic programming characterization of an optimal policy [2, 3, 4].
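
For a small N the enumeration is easy to carry out and gives a useful baseline against which a faster method can be checked. The sketch below is an illustration only: it uses the twelve-month data of the example in Section 4, assumes that each order covers exactly the demand up to the next order, and introduces the helper name `plan_cost` for the purpose.

```python
from itertools import product

# Twelve-month example data (demand, setup cost, unit carrying charge).
d = [69, 29, 36, 61, 61, 26, 34, 67, 45, 67, 79, 56]
s = [85, 102, 102, 101, 98, 114, 105, 86, 119, 110, 98, 114]
i = [1] * 12

def plan_cost(order_flags):
    """Total setup plus carrying cost when orders are placed where flag == 1."""
    total, last = 0, 0
    for t, flag in enumerate(order_flags):
        if flag:
            total += s[t]          # setup cost of a new order
            last = t
        else:
            # demand of period t is carried from the last order period
            total += d[t] * sum(i[last:t])
    return total

# An order is always placed in the first period; the other 11 are free.
plans = [(1,) + rest for rest in product((0, 1), repeat=len(d) - 1)]
best_plan = min(plans, key=plan_cost)
print(len(plans), plan_cost(best_plan), best_plan)
```

With twelve periods there are only 2,048 plans, but the count doubles with every added period, which is the motivation for the recursive formulations that follow.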

Let I denote the inventory entering a period and $I_0$ initial inventory; for
period t

$$ I = I_0 + \sum_{j=1}^{t-1} x_j - \sum_{j=1}^{t-1} d_j \ge 0. \tag{1} $$

We may write the functional equation [2, 4] representing the minimal cost policy
for periods t through N, given incoming inventory I, as

$$ f_t(I) = \min_{x_t \ge 0,\; I + x_t \ge d_t} \left[ i_{t-1} I + \delta(x_t) s_t + f_{t+1}(I + x_t - d_t) \right] \tag{2} $$

where

$$ \delta(x_t) = \begin{cases} 0 & \text{if } x_t = 0 \\ 1 & \text{if } x_t > 0. \end{cases} \tag{3} $$

In period N we have

$$ f_N(I) = \min_{x_N \ge 0,\; I + x_N = d_N} \left[ i_{N-1} I + \delta(x_N) s_N \right]. \tag{4} $$

Consequently we compute $f_t$, starting at t = N, as a function of I; ultimately
we derive $f_1$, thereby obtaining an optimal solution as I for period 1 is specified.
Theorem 2 below establishes that it is permissible to confine consideration to
only N + 2 - t, t > 1, values of I at period t.

By taking cognizance of the special properties of our model, we may formulate
an alternative functional equation which has the advantage of potentially requiring
less than N periods' data to obtain an optimal program; that is, it may
be possible without any loss of optimality to narrow our program commitment
to a shorter "planning horizon" than N periods on the sole basis of data for this
horizon. Just as one may prove that in a linear programming model it suffices
to investigate only basic sets of variables in search of an optimal solution, we
shall demonstrate that in our model an optimal solution exists among a very
simple class of policies.

3 We confine ourselves, as one does in the static model, to situations in which (nearly)
constant lead or delivery time is a workable approximation to reality.

4 Formally the model may also be posed as a fixed charge linear programming problem;
see W. M. Hirsch and G. B. Dantzig, "The Fixed Charge Problem," RAND Corporation
RM-1383, December 1954.

It is necessary to postulate that $d_1 \ge 0$ is demand in period 1 net of starting
inventory. 5 Then the fundamental proposition underlying our approach asserts
that it is sufficient to consider programs in which at period t one does not both
place an order and bring in inventory.

Theorem 1. There exists an optimal program such that $I x_t = 0$ for all t (where
I is inventory entering period t).

Proof: Suppose an optimal program suggests both to place an order in period
t and to bring in I (i.e., $I x_t > 0$). Then it is no more costly to reschedule the
purchase of I by including the quantity in $x_t$, for this alteration does not incur
any additional ordering cost and does save the cost $i_{t-1} I \ge 0$.

Note that the theorem does not hold if our model includes buying or production
costs which are not constant and identical for all periods. In the latter
case economies of scale might very well call for the carrying of inventory into
period t even when an order or setup takes place in t [6].

Two corollaries follow from the theorem. 

Theorem 2. There exists an optimal program such that for all t

$$ x_t = 0 \quad \text{or} \quad x_t = \sum_{j=t}^{k} d_j \quad \text{for some } k,\; t \le k \le N. $$

Proof: Since all demands must be met, any other value for $x_t$ implies there
exists a period $t^* \ge t$ such that $I x_{t^*} > 0$; but Theorem 1 assures that it is sufficient
to consider programs in which such a condition does not arise.

The implication of Theorem 2 is that we can limit the values of I in (2) for
period t to zero and the cumulative sums of demand for periods t up to N. If
initial inventory is zero, then only N(N + 1)/2 different values of I in toto over
the entire N periods need be examined.


Theorem 3. There exists an optimal program such that if $d_{t^*}$ is satisfied by
some $x_{t^{**}}$, $t^{**} < t^*$, then $d_t$, $t = t^{**} + 1, \ldots, t^* - 1$, is also satisfied by $x_{t^{**}}$.

Proof: In a program not satisfying the theorem, either I for period $t^{**}$ is
positive or I for period $t^*$ is brought into some period $t'$, $t^{**} < t' < t^*$, where
$x_{t'} > 0$; but again by Theorem 1, it is sufficient to consider programs in which
such conditions do not arise.

5 ... by netting out starting inventory from demand in period 1. If starting
inventory in fact exceeds the total demand in period 1, then the
forward algorithm to be suggested may not be correct. In particular, Theorem 1 below
... period 1; in such a case (2) still remains applicable. A sufficient condition
for the existence of a forward solution is that $s_t$ is monotonically non-increasing. An optimal
solution is found then by using up initial inventory period by period until ...

We next investigate a condition under which we may divide our problem into
two smaller subproblems.

Theorem 4. Given that I = 0 for period t, it is optimal to consider periods 1
through t - 1 by themselves.

Proof: By hypothesis, (2) in period t - 1 for the N period model is

$$ f_{t-1}(I) = \min \left[ i_{t-2} I + \delta(x_{t-1}) s_{t-1} + f_t(0) \right], \tag{5} $$

and for the t - 1 period model is correspondingly

$$ g_{t-1}(I) = \min \left[ i_{t-2} I + \delta(x_{t-1}) s_{t-1} \right]. \tag{6} $$

But the functional relations (5) and (6) differ only by a constant $f_t(0)$. Consequently
what is optimal for (6) is optimal for (5), and by the recursive structure
of the model, the latter conclusion continues to hold for all the earlier periods.

We may now offer an alternative formulation to (2). Let F(t) denote the
minimal cost program for periods 1 through t. Then

$$ F(t) = \min \left[ \min_{1 \le j < t} \left[ s_j + \sum_{h=j}^{t-1} \sum_{k=h+1}^{t} i_h d_k + F(j - 1) \right],\; s_t + F(t - 1) \right] \tag{7} $$

where $F(1) = s_1$ and $F(0) = 0$. That is, the minimum cost for the first t periods
comprises a setup cost in period j, plus charges for filling demand $d_k$,
k = j + 1, ..., t, by carrying inventory from period j, plus the cost of adopting
an optimal policy in periods 1 through j - 1 taken by themselves. Theorems 2, 3,
and 4 guarantee that at period t we shall find an optimum program of this type.
With the present formulation, (7) is computed, starting at t = 1. At any period
t, (7) implies that only t policies need to be considered. The minimum in (7)
need not be unique, so that there may be alternative optimal solutions. When
we derive F(N), we shall have solved the problem, for N is the last period to be
considered.
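
Recursion (7) translates almost line for line into code. The following is a minimal sketch under the reading of (7) given above, in which an order in period j carries demand $d_k$ through periods j, ..., k - 1; the function name `forward_costs` and the 1-indexed period convention are choices made for this illustration. The data are those of the example in Section 4.

```python
def forward_costs(d, s, i):
    """Return [F(0), F(1), ..., F(N)] computed from recursion (7)."""
    n = len(d)
    F = [0] * (n + 1)
    for t in range(1, n + 1):
        candidates = []
        for j in range(1, t + 1):                 # last order placed in period j
            holding = sum(i[h - 1] * d[k - 1]
                          for h in range(j, t)    # periods in which stock is held
                          for k in range(h + 1, t + 1))
            candidates.append(s[j - 1] + holding + F[j - 1])
        F[t] = min(candidates)                    # j = t gives s_t + F(t - 1)
    return F

d = [69, 29, 36, 61, 61, 26, 34, 67, 45, 67, 79, 56]
s = [85, 102, 102, 101, 98, 114, 105, 86, 119, 110, 98, 114]
i = [1] * 12

F = forward_costs(d, s, i)
print(F[1:])
```

The work grows roughly with the cube of N in this naive form because the holding charge is recomputed from scratch for each pair (j, t); it could be accumulated incrementally, but the direct form is kept here to stay close to (7).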

Finally we come to what is perhaps the most interesting property of our 
model. 

The Planning Horizon Theorem. 6 If at period $t^*$ the minimum in (7) occurs
for $j = t^{**} \le t^*$, then in periods $t > t^*$ it is sufficient to consider only
$t^{**} \le j \le t$. In particular, if $t^* = t^{**}$, then it is sufficient to consider programs
such that $x_{t^*} > 0$.

Proof: Without loss of optimality we restrict our attention to programs of
the form specified in Theorems 1-4. Suppose a program suggests that $d_t$ is satisfied
by $x_{t^{***}}$, where $t^{***} < t^{**} \le t^* < t$. Then by Theorem 3 $d_{t^*}$ is also satisfied
by $x_{t^{***}}$. But by hypothesis we know that costs are not increased by rescheduling
the program to let $d_{t^*}$ be satisfied by $x_{t^{**}} > 0$.

The planning horizon theorem states in part that if it is optimal to incur a
setup cost in period $t^*$ when periods 1 through $t^*$ are considered by themselves,
then we may let $x_{t^*} > 0$ in the N period model without foregoing optimality.
By Theorems 1 and 4 it follows further that we may adopt an optimal program
for periods 1 through $t^* - 1$ considered separately.

6 The reader may wish to prove the corresponding theorem for (2): Let $I^{**}$ be the value
of incoming inventory associated with $\min_I f_{t^*}(I)$; then in period $t < t^*$ it is sufficient to
consider only $0 \le I \le I^{**} + \sum_{j=t}^{t^*-1} d_j$. In particular, if $I^{**} = 0$, then it is sufficient to
consider programs such that I = 0 at period $t^*$.

3. The Algorithm 

The algorithm at period $t^*$, $t^* = 1, 2, \ldots, N$, may be generally stated as

1. Consider the policies of ordering at period $t^{**}$, $t^{**} = 1, 2, \ldots, t^*$, and filling
demands $d_t$, $t = t^{**}, t^{**} + 1, \ldots, t^*$, by this order.

2. Determine the total cost of these $t^*$ different policies by adding the ordering
and holding costs associated with placing an order at period $t^{**}$, and the cost of
acting optimally for periods 1 through $t^{**} - 1$ considered by themselves. The
latter cost has been determined previously in the computations for periods
$t = 1, 2, \ldots, t^* - 1$.

3. From these $t^*$ alternatives, select the minimum cost policy for periods 1
through $t^*$ considered independently.

4. Proceed to period $t^* + 1$ (or stop if $t^* = N$).

Table 1 portrays the symbolic scheme for the algorithm. The notation
$(1, 2, \ldots, t^{**})\; t^{**} + 1, t^{**} + 2, \ldots, t^*$ in Table 1 indicates that an order is
placed in period $t^{**} + 1$ to cover the demands of $d_t$, $t = t^{**} + 1, t^{**} + 2, \ldots, t^*$,
and the optimal policy is adopted for periods 1 through $t^{**}$ considered separately.
At the bottom of the table we record the minimum cost plan for periods 1
through $t^*$.

In general, it may be necessary to test N policies at the N-th period, implying
a table of N(N + 1)/2 entries (versus $2^{N-1}$ for all possibilities). Thus the forward
algorithm (7) is at least as efficient as (2). As we shall see, the number of entries
usually is much smaller than this number if we make full use of the planning
horizon theorem.
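
The four steps, together with the planning horizon theorem, can be sketched as a single routine. This is an illustration under stated assumptions rather than a definitive implementation: the function name `wagner_whitin`, the entry counter, and the rule of keeping the earliest minimizing period when costs tie are all choices made here. The routine also traces back the order quantities once F(N) is known.

```python
def wagner_whitin(d, s, i):
    """Forward algorithm with planning-horizon pruning (periods 1-indexed).

    Returns (minimum total cost, {order period: order quantity},
    number of table entries evaluated).
    """
    n = len(d)
    F = [0] * (n + 1)
    last_order = [0] * (n + 1)   # period j attaining the minimum for periods 1..t
    start = 1                    # earliest order period still worth testing
    entries = 0
    for t in range(1, n + 1):
        best, best_j = None, None
        for j in range(start, t + 1):
            holding = sum(d[k - 1] * sum(i[j - 1:k - 1])
                          for k in range(j + 1, t + 1))
            cost = s[j - 1] + holding + F[j - 1]
            entries += 1
            if best is None or cost < best:
                best, best_j = cost, j
        F[t], last_order[t] = best, best_j
        start = best_j           # planning horizon theorem: later t need j >= best_j
    orders, t = {}, n
    while t > 0:                 # trace back the chain of last order periods
        j = last_order[t]
        orders[j] = sum(d[j - 1:t])
        t = j - 1
    return F[n], dict(sorted(orders.items())), entries

d = [69, 29, 36, 61, 61, 26, 34, 67, 45, 67, 79, 56]
s = [85, 102, 102, 101, 98, 114, 105, 86, 119, 110, 98, 114]
i = [1] * 12

total, orders, entries = wagner_whitin(d, s, i)
print(total, orders, entries)
```

On the twelve-month data of Section 4 the pruned routine evaluates well under the N(N + 1)/2 = 78 entries of the full table.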


TABLE 1

Month t                          1     2       3         4           ...   N

Ordering cost                    s_1   s_2     s_3       s_4         ...   s_N
Demand                           d_1   d_2     d_3       d_4         ...   d_N

(1, 2, ..., t-1) t               1     (1)2    (1, 2)3   (1, 2, 3)4  ...   (1, 2, ..., N-1) N
(1, 2, ..., t-2) t-1, t                12      (1)23     (1, 2)34    ...   (1, 2, ..., N-2) N-1, N
(1, 2, ..., t-3) t-2, t-1, t                   123       (1)234      ...   (1, 2, ..., N-3) N-2, N-1, N
(1, 2, ..., t-4) t-3, ..., t                             1234        ...   (1, 2, ..., N-4) N-3, N-2, N-1, N

Minimum cost
Optimal policy (1, 2, ..., t)    (1)   (1, 2)  (1, 2, 3) (1, 2, 3, 4) ...  (1, 2, ..., N)
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4. An Example 

Table 2 presents a sample set of data for a 12 month period; to simplify computations
we have let $i_t = 1$ for all t; Table 3 contains the specific calculations.

To illustrate, the optimal plan for period 1 alone is to order (entailing an
ordering cost of 85), Table 3. Two possibilities must be evaluated for period 2:
order in period 2, and use the best policy for period 1 considered alone (at a
cost of 102 + 85 = 187); or order in period 1 for both periods, and carry inventory
into period 2 (at a cost of 85 + 29 = 114). The better policy is the
latter one. In period 3 there are three alternatives: order in period 3, and use the
best policy for periods 1 and 2 considered alone (at a cost of 102 + 114 = 216);
or order in period 2 for the latter two periods and use the best policy for period 1
considered alone (at a cost of 102 + 36 + 85 = 223); or order in period 1 for
the entire three periods (at a cost of 85 + 29 + 36 + 36 = 186).

TABLE 2

Month t     d_t     s_t     i_t

    1        69      85      1
    2        29     102      1
    3        36     102      1
    4        61     101      1
    5        61      98      1
    6        26     114      1
    7        34     105      1
    8        67      86      1
    9        45     119      1
   10        67     110      1
   11        79      98      1
   12        56     114      1

Average    52.5   102.8      1

TABLE 3

Month t            1    2    3    4    5    6    7    8    9   10     11     12

Ordering cost     85  102  102  101   98  114  105   86  119  110     98    114
Demand            69   29   36   61   61   26   34   67   45   67     79     56

                  85  187  216  287  375  462  505  555  674  710    808    903
                      114  223  277  348  401  496  572  600  741    789    864
                           186            400  469            734           901
                                               502

Minimum cost      85  114  186  277  348  400  469  555  600  710    789    864
Optimal policy*    1   12  123   34   45  456  567    8   89   10  10, 11  11, 12

* Only the last order period is shown; 567 indicates that the optimal policy for periods
1 through 7 is to order in period 5 to satisfy $d_5$, $d_6$, and $d_7$, and adopt an optimal policy
for periods 1 through 4 considered separately.

In our example, it is clear that it would never pay to carry inventory from
periods 1 or 2 to meet $d_4$, since the carrying charges would exceed the ordering
cost in period 4. A fortiori it would never pay to carry inventory from periods 1
or 2 to meet $d_5, d_6, \ldots, d_{12}$, because to do so would also imply that inventory
was being carried to period 4 (Theorem 3).

Note that periods 1 through 8, and 8 through 10 comprise planning horizons.
Whenever a time horizon (or a simplification of the type mentioned in the previous
paragraph) arises, the entries in the table can be truncated below the southeast
diagonal through the entry for $(1, 2, \ldots, t^* - 1)\,t^*$, as we have done in
Table 3.

For our set of data the optimal policy is

1. Order at period 11, $x_{11}$ = 79 + 56 = 135, and use the optimal policy for
periods 1 through 10, implying

2. Order at period 10, $x_{10}$ = 67, and use the optimal policy for periods 1
through 9, implying

3. Order at period 8, $x_8$ = 67 + 45 = 112, and use the optimal policy for
periods 1 through 7, implying

4. Order at period 5, $x_5$ = 61 + 26 + 34 = 121, and use the optimal policy
for periods 1 through 4, implying

5. Order at period 3, $x_3$ = 36 + 61 = 97, and use the optimal policy for periods
1 through 2, implying

6. Order at period 1, $x_1$ = 69 + 29 = 98.

The total cost of the optimal policy is 864.

By use of the suggested tabular form, it is also relatively easy to make sensitivity
analyses of the solution. For example, the ordering cost in period 2 would
have to decrease by more than 73 in order to make it less costly to set up in
period 2 than to carry inventory from period 1; ordering cost in period 11 would
have to increase by more than 37 in order to make it less costly to order in period
10 for the last three periods.
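
The two thresholds quoted above are differences between neighboring entries of Table 3, and they can be recovered mechanically. The sketch below is illustrative only; the helper names `holding` and `cost_if_last_order_in` are introduced here and are not part of the paper.

```python
d = [69, 29, 36, 61, 61, 26, 34, 67, 45, 67, 79, 56]
s = [85, 102, 102, 101, 98, 114, 105, 86, 119, 110, 98, 114]
i = [1] * 12

def holding(j, t):
    """Carrying charge when an order in period j covers periods j..t (1-indexed)."""
    return sum(d[k - 1] * sum(i[j - 1:k - 1]) for k in range(j + 1, t + 1))

# F[t] = minimum cost for periods 1..t, following recursion (7).
F = [0]
for t in range(1, len(d) + 1):
    F.append(min(s[j - 1] + holding(j, t) + F[j - 1] for j in range(1, t + 1)))

def cost_if_last_order_in(j, t):
    """Table 3 entry: last order in period j, optimal policy before it."""
    return s[j - 1] + holding(j, t) + F[j - 1]

# How far s_2 must fall before ordering in period 2 beats carrying from period 1.
margin_period_2 = cost_if_last_order_in(2, 2) - cost_if_last_order_in(1, 2)

# How far s_11 must rise before ordering in period 10 for the last three periods wins.
margin_period_11 = cost_if_last_order_in(10, 12) - cost_if_last_order_in(11, 12)

print(margin_period_2, margin_period_11)
```

The second margin is valid because neither F(9) nor the cost of ordering in period 10 depends on the setup cost of period 11.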

5. A Steady State Example 

In the case of steady state demand and constant ordering and holding costs, 
our algorithm yields the same result as the standard “square root formula.” 
Assume that throughout the entire year monthly demand d = 52.5, ordering 
(setup) cost s = 102.80, and interest charge i = 1. The square root formula 
for the order quantity gives 

$$Q = \sqrt{2ds/i} = \sqrt{2 \times 52.5 \times 102.8 / 1} \approx 104. \qquad (8)$$

Since this is approximately two months' demand, we round Q to 105 units for
comparison purposes.
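
As a quick arithmetic check on formula (8), the following is a minimal sketch in Python using the data stated above. The function name `square_root_lot_size` is ours and is introduced only for illustration.

```python
import math

def square_root_lot_size(d, s, i):
    """Standard square root formula: Q = sqrt(2 * d * s / i)."""
    return math.sqrt(2.0 * d * s / i)

# Data from the steady state example in the text.
monthly_demand = 52.5
Q = square_root_lot_size(d=monthly_demand, s=102.80, i=1.0)

# Express the lot as a number of months of demand.
months_of_supply = Q / monthly_demand
print(round(Q, 2), round(months_of_supply, 2))
```

The lot comes out slightly below two months of demand, which is why the text rounds Q up to 105 units (exactly two months) before comparing costs.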

Applying our algorithm yields that for the first two and three periods, the
optimal policies are 12 and (1, 2)3, indicating that the first two periods comprise
a planning horizon. In the steady state case, all planning horizons are the same,
i.e., orders will be placed every two months. Therefore annual costs are easily
obtained as six times the costs for one planning horizon, amounting to 931.80.

Annual total variable costs may be calculated with the standard lot size 
model as 7 

$$12\left[\frac{(Q - d)\,i}{2} + \frac{ds}{Q}\right] = 12\left[\frac{(105 - 52.5)(1)}{2} + \frac{52.5\,(102.80)}{105}\right] = 931.80.$$
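
The agreement between the two cost figures can be checked with a short sketch. The function names are ours; the cyclic-policy calculation assumes, as in the model of this paper, that a unit held over into the next month incurs the interest charge i once per month carried.

```python
def lot_size_annual_cost(Q, d, s, i, periods=12):
    """Annual variable cost from the standard lot size expression in the text:
    periods * [(Q - d) * i / 2 + d * s / Q]."""
    return periods * ((Q - d) * i / 2.0 + d * s / Q)

def cyclic_policy_annual_cost(n, d, s, i, periods=12):
    """Annual cost when one order covering n months is placed every n months.
    Demand for the k-th later month of a cycle is carried for k months."""
    holding_per_cycle = sum(k * d * i for k in range(n))
    return (periods / n) * (s + holding_per_cycle)

d, s, i = 52.5, 102.80, 1.0
formula_cost = lot_size_annual_cost(105.0, d, s, i)
two_month_cost = cyclic_policy_annual_cost(2, d, s, i)
one_month_cost = cyclic_policy_annual_cost(1, d, s, i)
three_month_cost = cyclic_policy_annual_cost(3, d, s, i)
print(round(formula_cost, 2), round(two_month_cost, 2))
```

Under these assumptions the two-month cycle is also cheaper than ordering every month or every three months, which is consistent with the planning horizon found by the algorithm.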

Thus the two models are equally costly. If the square root formula had
not resulted in the ordering of an integral number of months’ supply, the costs
under the two methods would have been different due to the discrete division of
time in our model. However, this difference vanishes once the length of our time
period is reduced.

This paper studies the planning problem faced by a machine shop required
to produce many different items so as to meet a rigid delivery schedule, re¬ 
main within capacity limitations, and at the same time minimize the use of 
premium-cost overtime labor. It differs from alternative approaches to this 
well-known problem by allowing for setup cost indivisibilities. 

As an approximation, the following linear programming model is suggested: 

Let an activity be defined as a sequence of the inputs required to satisfy the 
delivery requirements for a single item over time. The input coefficients for 
each such activity may then be constructed so as to allow for all setup costs 
incurred when the activity is operated at the level of unity or at zero. It is then 
shown that in any solution to this problem, all activity levels will turn out to 
be either unity or zero, except for those related to a group of items which, in 
number, must be equal to or less than the original number of capacity con¬ 
straints. This result means that the linear programming solution should pro¬ 
vide a good approximation whenever the number of items being manufactured 
is large in comparison with the number of capacity constraints. 

1. Background 

It is common knowledge that the presence of “setup costs” in a manufactur¬ 
ing process raises questions of indivisibilities [4], and that such indivisibilities 
constitute a formidable obstacle to any attempt to phrase economic lot size 
problems in terms of linear programming. In economists’ language, this amounts 
to saying that the presence of economies of scale contradicts the assumptions 
of marginal analysis, along with such economic theories as linear programming,
which are so deeply rooted in marginalism. 

This paper reports upon the successful use of linear programming in a special 
instance involving setup costs and economic lot sizes. 3 Unlike a number of the 

* Received June 1957. 

1 Research undertaken by the Cowles Foundation for Research in Economics under 
Contract Nonr-609(01) with the Office of Naval Research. 

2 The author is deeply indebted to E. Greenwood, general manager, and also to C. Cor- 
rell, A. Pastick, and A. Goldman, all of Norden-Ketay, Inc., Milford and Stamford, Con¬ 
necticut. 

3 If the total costs of producing x units of a particular item in a single lot are given by:

$$a\delta + bx, \qquad \text{where } x > 0 \text{ implies } \delta = 1 \text{ and } x = 0 \text{ implies } \delta = 0,$$

then the constant a is said to represent the “setup cost” for that item and the constant

more recent proposals [e.g., 1], the particular model is a non-stochastic one,
and in this sense is of less general applicability. The distinctive feature of the 
approach outlined here is that capacity limitations—and hence interdependence 
between individual items—are treated as an explicit part of the economic lot 
size decision. Since the background of the individual problem makes it possible 
to justify a number of simplifications in the mathematical model, it seems 
worthwhile to summarize the leading features of that background: 

The plant in question sells most of its output to the armed services of the 
United States. The “end items” of this plant are in turn an input for other 
manufacturing activities, and the timing of deliveries takes on even greater 
importance here than in the routine production of consumers’ goods. Indeed, 
in his competitive bid for any particular product, the manufacturer stipulates 
not only the price at which he will undertake to produce the item, but also the 
dates at which individual units will be delivered to the Government. Because 
the actual timing of deliveries affects the manufacturer’s reputation—hence 
his ability to obtain future contract awards—one assumption that underlies 
all production planning is that the manufacturer will adhere to the promised 
delivery dates. 

Since the plant does not produce large quantities of any individual end item, 
the productive process is of the batch type rather than continuous. As is typi¬ 
cal of many metal-working establishments, the first step is to produce individ¬ 
ual parts in the plant’s own machine shop and to procure certain parts from 
other manufacturers. Once all the parts for a particular finished unit are on 
hand, these are assembled, tested, and the item is turned over to the Govern¬ 
ment. Note that if the Government contract calls for delivery of 25% of the 
finished units over each of four successive months, only 25% of the total re¬ 
quirement for each individual part needs to be available at the time actual 
assembly is initiated. 4 This means that one of the significant choices to be made 
is that of splitting production lots for individual parts so as to meet the initial 
delivery requirements, but still defer a portion of the machining work until 
the latter part of the delivery cycle. Lot-splitting does, of course, bring about 
an increase in setup costs, and so an optimum lot size decision entails an eco¬ 
nomic balance between the advantages of reducing setup costs, and the ad¬ 
vantages of smoothing out the production program over time. 

Although there is some overlap between planning for the machine shop and 
for the final assembly and testing operations, this paper is primarily concerned 
with the machine shop itself, and only to a secondary extent with the problems 
created by this overlap. Actual planning of the machine shop’s activities takes 
place at two distinct echelons of the plant’s management, and correspondingly 
at two different levels of abstraction. Short-range scheduling is concerned solely 

b the “incremental unit cost.” Evidently setup costs are minimized by concentrating an 
entire production requirement into a single lot, rather than by splitting up that require¬ 
ment among several lots. 

4 In practice, “buffer stock” considerations may indicate that more than 25% of the
total for each part ought to be available before final assembly is initiated.

with such details as which parts are to be manufactured, and which individuals
and machines are to be used. Long-range scheduling (up to eighteen months
ahead) is concerned with the general problem of whether the machine shop’s
existing resources will be able to meet the company’s future delivery
commitments, and if not, what policies should be adopted to supplement the
existing resources: overtime work, recruiting and training of additional personnel,
and outside procurement of certain parts. 5 This paper is concerned almost
entirely with the long-range problem, as distinct from the day-to-day operation.

Within the company, the traditional procedure for long-range production 
planning has emphasized calculations made upon the assumption that each of 
the parts was to be run off without splitting any of the lots—despite the fact 
that lot-splitting is far from a rare occurrence. Once the simplifying assumption 
is made, it is then largely a matter of arithmetic to take the end item delivery 
schedules, pool this information with the “operation sheet” machining time 
estimates, and arrive at the man-hour requirements for machining during each 
of the months in which parts are to be produced for a particular end item. 
Given these estimates of requirements, in turn it is possible to subtract off the 
straight-time man-hours available from the existing work force, and come up 
with a figure for the deficit or surplus of manpower over requirements. In case 
of an impending deficit, it is up to the long-range planning group to recom¬ 
mend whether to order overtime work, to attempt outside procurement for 
certain of the parts that would normally have been made in the company’s 
own plant, or to alter the initially stipulated schedule of parts deliveries to the 
final assembly operation. In practice, of course, a tight scheduling problem will 
force the planning group to depart from the assumption of no split lots, and 
thereby to depart from minimizing setup costs. 
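
The arithmetic of this traditional procedure can be sketched as follows. All figures below are hypothetical and serve only to illustrate the calculation; the function names are ours, and each part is assumed to be run as a single unsplit lot in one month.

```python
def no_split_requirements(parts, months):
    """Monthly machining man-hours when every part is run as one unsplit lot.
    Each part is (setup_hours, hours_per_unit, units, month_index)."""
    load = [0.0] * months
    for setup, per_unit, units, month in parts:
        load[month] += setup + per_unit * units
    return load

def manpower_balance(requirements, straight_time):
    """Requirements minus straight-time availability; positive means deficit."""
    return [req - avail for req, avail in zip(requirements, straight_time)]

# Hypothetical parts and a hypothetical straight-time work force.
parts = [(10.0, 0.9, 100, 0), (8.0, 0.5, 40, 0), (12.0, 1.5, 60, 1)]
load = no_split_requirements(parts, months=3)
balance = manpower_balance(load, straight_time=[120.0, 120.0, 120.0])
print(load, balance)
```

A positive entry in `balance` marks a month in which the planning group would have to recommend overtime, outside procurement, or a revised schedule, which in practice means abandoning the no-split assumption.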

In the linear programming calculations, just as with current methods of
scheduling, two simplifying features of this particular manufacturing operation
are exploited. Neither is essential to the use of linear programming, but
both are highly convenient for expository purposes: (1) Limitations on the
availability of specific machines have been disregarded. It will ordinarily be
true that if a particular production plan stays within the limitations of the
manpower available within a particular time period, the plan will also be within
the capabilities of the plant’s machine tool equipment. (2) Inventory-holding costs
have been neglected. Physical storage costs are quite low, and the contractual
arrangement of Government “progress payments” makes the interest cost
element a minor one.

2. A linear programming formulation 

The linear programming model of the machine shop’s operations is intended 
to provide numerical answers to the following general problem: Given a large 

5 Decisions on the purchase of new equipment constitute an additional degree of
freedom, but since the payout period for such equipment normally extends over several
years, the company’s policies on equipment purchase may be regarded as fixed, at least
as far as an eighteen-month production schedule is concerned.

number of individual parts to be machined, and given delivery requirements
for each of these parts over a series of time periods, determine how many of 
each of the parts should be machined in each time period—taking account of 
the fact that there are limits upon the amount of straight-time and of overtime 
productive capacity available during the individual periods, and also that lot¬ 
splitting increases the total amount of setup time required. 

In determining an output schedule, the objective is assumed to be the mini¬ 
mization of overtime labor requirements. This criterion for choice among al¬ 
ternative production plans implies: (a) that the straight-time services from 
the projected work force represent a fixed commitment on the company’s part, 
and that nothing can be saved by failing to use up these services; (b) that the 
total labor requirements fall within the man-hours available from straight-time 
plus overtime work so that the question of outside procurement does not arise; 6 7 
and (c) that the only remaining variable costs are those that increase with the 
total number of overtime man-hours. 

Underlying this linear programming formulation is the definition of an
activity as a sequence of inputs over time that satisfies the delivery requirements
for a particular part. (Individual parts are distinguished from one another by
the subscript i, and the alternative sequences for the i-th part by the
subscript j.) Since, in general, there will be more than one sequence that is
feasible from the viewpoint of delivery requirements for the i-th part, the linear
programming variables $x_{ij}$ refer to the fraction of the total requirement for the
i-th part that is supplied by the j-th sequence of inputs for that part. Although
no physical meaning can be attached to a fractional value of $x_{ij}$ (e.g., a
solution specifying that half the requirements for a given part are to be met by a
one-lot sequence of output and half by a split-lot sequence), there is no
guarantee that such proper fractions will be absent from a linear programming
solution. All that can be guaranteed is that if there are T time periods
distinguished within the model, there will be at most T parts for which the $x_{ij}$
fractions turn out to be intermediate between zero and one. (A proof of this
assertion is given in section 5.)

6 From the viewpoint of model formulation, the question of outside procurement is an 
inessential complication. Ordering from an outside supplier differs from internal produc¬ 
tion only in that it costs money and imposes no drain upon the internal availability of 
labor. 

From this same viewpoint, the question of recruiting and training new personnel is 
also an inessential complication. An activity of this sort could be incorporated directly 
within the model—provided that the training cost per man was known. 

7 For the special but nonetheless interesting case in which delivery requirements recur
at a steady rate over the indefinite future, there is a meaningful interpretation that can
be attached to fractional values of $x_{ij}$, i.e., that the actual lot size be intermediate
between the quantities specified in the initial definition of the $x_{ij}$ alternatives. With this
interpretation, the non-linear inventory problem with storage and capacity restrictions
described by Rifas in the Churchman, Ackoff, and Arnoff volume [2, ch. 10] can be
transformed into a straightforward exercise in linear programming.

Unknowns, coefficients, and constants for the programming model are
defined in the following way:

(a) unknowns

$x_{ij}$ = fraction of the total requirement for the i-th part to be supplied by the
j-th alternative sequence of inputs. ($i = 1, \ldots, I$; $j = 1, \ldots, J$.)

$l_t$ = number of hours of overtime labor required during time period t. ($t = 1, \ldots, T$)

$s_t$ = “slack” variable for straight-time labor during time period t. (all t)

$v_t$ = “slack” variable for overtime labor during time period t. (all t)

(b) coefficients

$\beta_{ijt}$ = labor input required during period t in order to carry out the j-th
alternative production sequence for part i. (all i, j, and t)

(c) constants

$S_t$ = maximum availability of straight-time labor man-hours during the t-th
time period. (all t)

$V_t$ = maximum availability of overtime labor man-hours during the t-th time
period. (all t)


With these definitions, the linear programming model becomes:

(2.1)  Minimize $\sum_t l_t$

subject to:

(2.2)  $\sum_j x_{ij} = 1$   $(i = 1, \ldots, I)$

(2.3)  $\sum_i \sum_j \beta_{ijt}\, x_{ij} - l_t + s_t = S_t$   $(t = 1, \ldots, T)$

(2.4)  $l_t + v_t = V_t$   $(t = 1, \ldots, T)$

(2.5)  $x_{ij},\ l_t,\ s_t,\ v_t \ge 0$   (all i, j, t)

Expression (2.1) indicates the minimand (the sum of overtime labor
requirements), and conditions (2.2)-(2.5) list the constraints that must be
satisfied by the unknowns $x_{ij}$, $l_t$, $s_t$, and $v_t$. Equations (2.2) say that the total
requirement for the i-th part must be met by a combination of one or more
sequences of production for that part. (2.3) ensures that within each time period
the total number of man-hours required to satisfy the individual output
programs will not exceed the amount available of straight-time labor plus the
overtime to be ordered for that period. Equations (2.4) place upper bounds upon
the use of overtime labor. And finally, conditions (2.5) impose the usual
non-negativity requirements upon all unknowns.

To define more precisely what is meant by the $x_{ij}$ variables and the $\beta_{ijt}$
coefficients, it is easiest to refer to a three-period numerical example. (T = 3.)
Let the specific part under discussion be part 1, (i = 1), and let the deliveries
scheduled, the setup time, and the incremental labor requirements for that
part be:

$a_1$ = 10 man-hours = setup time for part 1.

$b_1$ = .9 man-hours/part = incremental labor required per unit of output of part 1.

$R_{11}$ = 30 units = delivery requirements for part 1 at end of period 1

$R_{12}$ = 30 units = delivery requirements for part 1 at end of period 2

$R_{13}$ = 40 units = delivery requirements for part 1 at end of period 3

With these numerical values, there are exactly four alternative sequences for
labor input and parts output that need to be considered explicitly within a
linear programming model. 8 These four are distinguished from one another by
the index j.


j index                       1                              2                        3                        4

Number of separate lots       1                              2                        2                        3

Delivery requirement to be    $R_{11} + R_{12} + R_{13}$     $R_{11}$ = 30 units      $R_{11} + R_{12}$        $R_{11}$ = 30 units
produced in period 1          = 100 units                                             = 60 units

Delivery requirement to be    0                              $R_{12} + R_{13}$        0                        $R_{12}$ = 30 units
produced in period 2                                         = 70 units

Delivery requirement to be    0                              0                        $R_{13}$ = 40 units      $R_{13}$ = 40 units
produced in period 3


It can be seen that each of the four output sequences just listed corresponds
to one of the four possible combinations of periods in which a production setup
occurs. ($2^{T-1} = 4$.) Once a particular combination is specified, the j-th plan is
uniquely determined by the rule that each delivery requirement is to be
satisfied out of production during the nearest preceding period in which setup costs
for that part are being incurred. It is not at all self-evident that the only
production sequences deserving consideration are those indicated by this rule. At a
later point, however, we shall prove that this is indeed the case, and that no
reduction in overall costs can be achieved by substituting other output
sequences in place of these. (See Appendix.) Hence, for the three-period model,
the four output plans are said to “dominate” all others. Corresponding to these
alternatives, the period-by-period inputs of labor required to satisfy the
delivery requirements for part 1 (that is, the $\beta_{1jt}$ coefficients) are:


j index                           1                     2                     3                     4

$x_{1j}$ unknown                  $x_{11}$              $x_{12}$              $x_{13}$              $x_{14}$

$\beta_{1j1}$, period 1 input     $a_1 + 100 b_1$ =     $a_1 + 30 b_1$ =      $a_1 + 60 b_1$ =      $a_1 + 30 b_1$ =
coefficients                      100 man-hours         37 man-hours          64 man-hours          37 man-hours

$\beta_{1j2}$, period 2 input     0                     $a_1 + 70 b_1$ =      0                     $a_1 + 30 b_1$ =
coefficients                                            73 man-hours                                37 man-hours

$\beta_{1j3}$, period 3 input     0                     0                     $a_1 + 40 b_1$ =      $a_1 + 40 b_1$ =
coefficients                                                                  46 man-hours          46 man-hours
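
The rule that each delivery is produced in the nearest preceding setup period lends itself to a short enumeration. The following is a sketch only: the function name `dominant_sequences` is ours, and the code assumes that the first period's delivery requirement is positive, so that period 1 always carries a setup.

```python
from itertools import combinations

def dominant_sequences(a, b, R):
    """Enumerate setup patterns and their period-by-period labor inputs.

    a: setup man-hours, b: man-hours per unit, R: deliveries due per period.
    Assumes R[0] > 0, so period 1 always carries a setup.
    Returns {setup periods (1-based): [labor in period 1, ..., period T]}.
    """
    T = len(R)
    plans = {}
    for n_extra in range(T):
        for extra in combinations(range(1, T), n_extra):
            setups = (0,) + extra  # zero-based setup periods
            labor = [0.0] * T
            for pos, t in enumerate(setups):
                nxt = setups[pos + 1] if pos + 1 < len(setups) else T
                # A lot started in period t covers deliveries up to the next setup.
                labor[t] = a + b * sum(R[t:nxt])
            plans[tuple(p + 1 for p in setups)] = labor
    return plans

# Part 1 of the three-period example in the text.
plans = dominant_sequences(a=10.0, b=0.9, R=[30, 30, 40])
for setups, labor in sorted(plans.items()):
    print(setups, [round(h, 2) for h in labor])
```

For T = 3 the enumeration yields the four setup patterns of the table, keyed by their setup periods rather than by the index j.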


8 If the programming model distinguishes between T time periods, there will be at most
$2^T - 1$ distinct combinations of periods within which some production could occur, hence

3. Aggregation of individual items

It has already been emphasized that in order for the model described by
(2.1)-(2.5) to be a useful one, the number of individual parts I must be quite
large in relation to the number of time periods, T. (In the particular machine
shop, this qualification creates few difficulties. The number of parts required
for any one finished item would seldom amount to less than 100 distinct pieces.)
But since the number of equations in the system equals (2T + I), this also
means that any conventional simplex computations of (2.1)-(2.5) would
involve substantial costs. In general, there are two ways around a difficulty of
this kind: One might be to construct a computing routine expressly designed to
exploit the special structure of this linear programming matrix. 9 The other
would be to aggregate the original model in some suitable way, obtain an
optimal linear programming solution to the aggregative model, and then translate
this solution back into a detailed production plan for each part. The second
course is the one that will be followed here.
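
To see where the count of (2T + I) equations comes from, the rows of (2.2)-(2.4) can be assembled explicitly. The sketch below is illustrative only: the function name, the column ordering, and the capacity figures S and V are our assumptions, while the labor input coefficients are those of part 1 in the three-period example.

```python
def build_equations(beta, S, V):
    """Assemble the equality rows of (2.2)-(2.4) as dense lists.

    beta[i][j][t]: labor input of sequence j for part i in period t.
    Assumed column order: all x_ij, then l_t, then s_t, then v_t.
    """
    T = len(S)
    n_x = sum(len(seqs) for seqs in beta)
    n_cols = n_x + 3 * T
    rows, rhs, starts = [], [], []
    col = 0
    for seqs in beta:                      # (2.2): sum_j x_ij = 1
        row = [0.0] * n_cols
        for j in range(len(seqs)):
            row[col + j] = 1.0
        rows.append(row)
        rhs.append(1.0)
        starts.append(col)
        col += len(seqs)
    for t in range(T):                     # (2.3): labor - l_t + s_t = S_t
        row = [0.0] * n_cols
        for start, seqs in zip(starts, beta):
            for j, seq in enumerate(seqs):
                row[start + j] = seq[t]
        row[n_x + t] = -1.0
        row[n_x + T + t] = 1.0
        rows.append(row)
        rhs.append(S[t])
    for t in range(T):                     # (2.4): l_t + v_t = V_t
        row = [0.0] * n_cols
        row[n_x + t] = 1.0
        row[n_x + 2 * T + t] = 1.0
        rows.append(row)
        rhs.append(V[t])
    return rows, rhs

beta = [[[100, 0, 0], [37, 73, 0], [64, 0, 46], [37, 37, 46]]]  # part 1
S = [60.0, 60.0, 60.0]   # hypothetical straight-time capacity
V = [45.0, 45.0, 45.0]   # hypothetical overtime capacity
rows, rhs = build_equations(beta, S, V)

# Candidate solution: three-lot plan only, no overtime, slacks take up the rest.
z = [0, 0, 0, 1] + [0, 0, 0] + [23, 23, 14] + [45, 45, 45]
residuals = [sum(c * v for c, v in zip(row, z)) - r for row, r in zip(rows, rhs)]
print(len(rows), len(rows[0]), residuals)
```

With one part and three periods this gives 2T + I = 7 rows, and each additional part adds exactly one row, which is why aggregation of parts reduces the size of the system so directly.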

The aggregation principle that seems most natural for this problem is to say
that two parts belong to the same production category if they have a similar
ratio of setup labor to total single-lot labor time, and if they also have a similar
pattern of delivery requirements. In other words:

let $R_{it}$ = delivery requirements for part i at end of period t

$a_i$ = setup time for part i

$b_i$ = incremental labor required per unit of output of part i

Then two parts (i = 1 and 2, respectively) are said to be in the same
production category k if and only if there are two factors of proportionality
$\alpha$ and $\lambda$ such that:

(3.1)  $\dfrac{R_{1t}}{R_{2t}} = \lambda$   (all t)


at most $2^T - 1$ “activities” for each parts category i. Furthermore, if the first period’s
delivery requirements are greater than zero ($R_{i1} > 0$), this upper bound becomes
$2^{T-1}$. Even for T = 6, $2^{T-1} = 32$, an easily manageable number of activities. Although
strict logic compels the enumeration of all such lot-splitting possibilities, in practice it
should not be difficult to reduce the number substantially by common-sense inspection.

9 If he examines the detached coefficients matrix associated with equations (2.2)-(2.4),
the reader will observe that every basis of rank (2T + I) that can be formed from this
matrix may be partitioned in the following way:

$$\begin{bmatrix} A & B \\ C & D \end{bmatrix}$$

where A is an identity matrix of rank (T + I), and D is a square matrix of rank T. The
matrices B and C are rectangular: the former with (T + I) rows and T columns, the latter
with T rows and (T + I) columns. The numerical difficulties connected with solving a
system of (2T + I) such equations are much closer to the order of magnitude of T, rather than of

and

(3.2)  $\dfrac{a_1}{a_1 + b_1 \sum_t R_{1t}} = \dfrac{a_2}{a_2 + b_2 \sum_t R_{2t}} = \alpha$


If conditions (3.1) and (3.2) hold, then the labor input coefficients for the 
two parts will be related to one another by a single factor of proportionality— 
a factor equal to the ratio of the two setup cost coefficients: 


(3.3)  $\dfrac{\beta_{2jt}}{\beta_{1jt}} = \dfrac{a_2}{a_1} = \dfrac{a_2 + b_2 \sum_t R_{2t}}{a_1 + b_1 \sum_t R_{1t}}$   (all j, t)


In other words, if conditions (3.1) and (3.2) apply, and if the j-th setup se¬ 
quence is an optimal one for part 1, it will also be an optimal one for part 2. 
Hence there is no reason to distinguish between the two parts within a linear 
programming model. All that needs to be done is to adopt one of them (e.g., 
part 1) as a standard unit of measurement, and then to express the aggregate 
requirement for that class of parts in equation (2.2) as: 

(3.4)  aggregate requirement $= 1 + \dfrac{a_2}{a_1} = 1 + \dfrac{a_2 + b_2 \sum_t R_{2t}}{a_1 + b_1 \sum_t R_{1t}}$.


By following this principle of aggregation, it will ordinarily be possible to 
make a substantial reduction in the number of equations listed in (2.2), and so 
reduce the burden of computations without lessening the inherent accuracy of 
the linear programming model. 
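
The proportionality claimed in (3.3) can be verified numerically. In the sketch below, part 1 is the part of the three-period example, while part 2 is a hypothetical part constructed so that conditions (3.1) and (3.2) hold; the function names are ours.

```python
def labor_inputs(a, b, R, setups):
    """Labor input per period for a setup pattern (1-based setup periods),
    each delivery being produced in the nearest preceding setup period."""
    T = len(R)
    out = [0.0] * T
    for pos, t in enumerate(setups):
        nxt = setups[pos + 1] if pos + 1 < len(setups) else T + 1
        out[t - 1] = a + b * sum(R[t - 1:nxt - 1])
    return out

def setup_ratio(a, b, R):
    """Ratio of setup labor to total single-lot labor time."""
    return a / (a + b * sum(R))

part1 = dict(a=10.0, b=0.9, R=[30, 30, 40])   # from the text
part2 = dict(a=20.0, b=3.6, R=[15, 15, 20])   # hypothetical; same ratio and pattern

ratios = []
for setups in [(1,), (1, 2), (1, 3), (1, 2, 3)]:
    beta1 = labor_inputs(setups=setups, **part1)
    beta2 = labor_inputs(setups=setups, **part2)
    # Compare only the periods in which labor is actually used.
    ratios += [v / u for u, v in zip(beta1, beta2) if u > 0]

# Aggregate requirement of (3.4), with part 1 as the standard unit.
aggregate_requirement = 1.0 + part2["a"] / part1["a"]
print(ratios, aggregate_requirement)
```

Every nonzero coefficient of the hypothetical part 2 is the same multiple of the corresponding coefficient of part 1, so the two parts can share one set of activities measured in units of part 1.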

In practice, the aggregation conditions (3.1) and (3.2) do not seem unduly
stringent. Conditions (3.1) say, e.g., that if at the end of period t, 25% of the
total requirement for part 1 is to become available for final assembly, then 25%
of the total for part 2 must also become available at that time. When both parts
are required for the same end item, the timing of delivery requirements will
usually be identical, and so there should be no difficulty in constructing a small
number of groups such that each part within a given class will satisfy conditions
(3.1). Once this kind of preliminary grouping has been effected, it should be easy
to define production categories k that also satisfy conditions (3.2), at least to
whatever degree of approximation is warranted by the goodness of the original
estimates of $a_i$, $b_i$, and $R_{it}$. Table 1 illustrates this point for the case of one
typical end item actually produced by our manufacturer, an end item requiring
110 distinct parts, each with the same pattern of delivery requirements. Here the
quality of the raw data was such that the six-category classification scheme shown
for these parts in Table 1 appeared entirely satisfactory for purposes of
long-range production planning.

In following through the aggregation procedure just described, it seems
convenient to define the unit of measurement (i.e., the “standard” part in each
production category k) as one for which the total of setup time plus single-lot
running time equals one hour. The requirement for all parts in category k may
then be expressed in terms of this standard as follows:

$Q_k$ = aggregate number of “standard” hours’ worth of parts in category k

$= \sum_{i \in k} \left( a_i + b_i \sum_t R_{it} \right)$

To summarize: Given the labor input coefficients $a_i$ and $b_i$, and also the
delivery requirements $R_{it}$ for each of many distinct parts i, a small number of
production categories will ordinarily suffice for the purpose of an aggregative
linear programming model. Furthermore, once an optimal solution has been
calculated, there should be no difficulty in translating the aggregative results back
into a detailed plan for the production of each distinct part.
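
The grouping step can be sketched as follows. The class intervals are those of Table 1, but the parts themselves are hypothetical, all parts are assumed to share one delivery pattern (so that only the setup labor ratio matters), and the function names are ours.

```python
def standard_hours(a, b, R):
    """Setup time plus single-lot running time for one part."""
    return a + b * sum(R)

def categorize(parts, edges):
    """Total 'standard' hours Q_k per class interval [edges[k-1], edges[k])
    of the setup labor ratio a / (a + b * sum(R))."""
    totals = {k: 0.0 for k in range(1, len(edges))}
    for a, b, R in parts:
        hours = standard_hours(a, b, R)
        ratio = a / hours
        for k in range(1, len(edges)):
            if edges[k - 1] <= ratio < edges[k]:
                totals[k] += hours
                break
    return totals

edges = [0.0, 0.10, 0.20, 0.30, 0.40, 0.50, 1.00]  # class intervals of Table 1
pattern = [30, 30, 40]                             # common delivery pattern
parts = [                                          # hypothetical parts
    (5.0, 0.5, pattern),
    (6.0, 1.0, pattern),
    (12.0, 0.75, pattern),
    (30.0, 0.5, pattern),
]
Q = categorize(parts, edges)
print(Q)
```

Each total plays the role of the constant on the right-hand side of (2.2) for its category once the unit of measurement is taken to be one “standard” hour.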

What makes such a translation possible? E.g., what if the linear programming 
solution called for 1,000 hours’ worth of parts in a given category to be produced 
in a single lot, and 1,000 hours’ worth by a split-lot plan? It is perfectly true that 
no sense could be made of a detailed plan that called for producing half of every 
distinct part with a single-lot program and half with split lots. But it would 
make perfectly good sense to translate the aggregative solution into a detailed 
plan that called for producing one distinct group of parts according to the single¬ 
lot plan and another group according to the split-lot plan—provided that the 
total “standard” time for parts in each of these two groups came to 1,000 hours 
apiece. The whole trick consists of observing that when the number of distinct 
parts is large, and that when one is dealing with groups of such parts, the alter- 
native production programs are not mutually exclusive, and that under these 
conditions, one can always spell out a meaningful detailed plan for any convex 
combination of the stated alternatives. 
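
One simple way of carrying out such a translation is a greedy assignment of whole parts to plans. This is a sketch of one possible procedure, not the method used in the plant: the part names and hours are hypothetical, and the function name is ours.

```python
def split_parts(hours_by_part, target_for_first_plan):
    """Greedy split: assign whole parts to the first plan, largest first,
    never overshooting its target; all other parts go to the second plan."""
    first, second, used = [], [], 0.0
    ordered = sorted(hours_by_part.items(), key=lambda kv: -kv[1])
    for name, hours in ordered:
        if used + hours <= target_for_first_plan:
            first.append(name)
            used += hours
        else:
            second.append(name)
    return first, second, used

# Hypothetical category: 2,000 "standard" hours spread over eight parts.
hours_by_part = {"p1": 400.0, "p2": 350.0, "p3": 300.0, "p4": 250.0,
                 "p5": 200.0, "p6": 200.0, "p7": 150.0, "p8": 150.0}

# Aggregative solution as in the text: 1,000 hours single-lot, 1,000 split-lot.
single_lot, split_lot, single_hours = split_parts(hours_by_part, 1000.0)
split_hours = sum(hours_by_part[name] for name in split_lot)
print(single_lot, single_hours, split_lot, split_hours)
```

With many parts of modest size the targets can be met closely; a category dominated by one or two large parts would leave a larger residual, which is the practical content of the requirement that the number of parts be large.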

Whenever the number of distinct parts in a production category exceeds
a handful, there should be no serious difficulty in translating the
aggregative solution back into a detailed program for the output of each part. 10
The formulation of the model ensures that not only will the total parts
requirement be satisfied in terms of “standard” units, but also that the production of
each of these parts can be time-phased in such a way as to satisfy the initially
stipulated delivery requirements.

TABLE 1

A system of aggregation for 110 distinct parts, as classified by setup labor ratio
$\alpha_i = a_i / (a_i + b_i \sum_t R_{it})$ (all $i \in k$)

Production    Class interval for the         Number of          Maximum number of            Total “standard” hours
category k    setup labor ratio              distinct parts     “standard” hours             required for all parts in
                                             within             required for any single      category k = $Q_k$
                                             category k         part in category k

1             $0 \le \alpha_i < .10$           9 parts          1,046 man-hours               4,064 man-hours
2             $.10 \le \alpha_i < .20$        33 parts            567 man-hours               4,774 man-hours
3             $.20 \le \alpha_i < .30$        32 parts            176 man-hours               2,097 man-hours
4             $.30 \le \alpha_i < .40$        21 parts             90 man-hours                 654 man-hours
5             $.40 \le \alpha_i < .50$        11 parts             66 man-hours                 286 man-hours
6             $.50 \le \alpha_i < 1.00$        4 parts             34 man-hours                  98 man-hours

Total                                        110 parts                                        11,973 man-hours

4. A numerical example

This illustrative example will refer to a case involving three time periods and
five production categories. In following through the calculations, the first step is
to obtain numerical values for the setup time ratios $\alpha_k$ and the percentage
delivery requirements $R_{kt} \div \sum_t R_{kt}$ within each of the five production
categories k. These parameters, along with the constants $Q_k$, $S_t$, and $V_t$, are all
listed in Table 2. With this information available, it is then a straightforward
matter to construct the $\beta_{kjt}$ labor input coefficients, and then the matrix of
detached coefficients (Table 3) for the linear programming model indicated
abstractly by conditions (2.1)-(2.5). 11 The only change introduced by the
aggregation procedure is the replacement of the index i by the index k ranging in
value from 1 to K. That is, instead of $x_{ij}$ variables which represent the fraction of
the requirement for the i-th part supplied by the j-th alternative sequence, we now
have $x_{kj}$ variables which represent the total number of “standard” hours’ worth of
parts in category k that are to be produced by the j-th sequence. Along with this
change, it is, of course, necessary to replace the constants of unity in equations
(2.2) with the $Q_k$, the total number of “standard” hours’ worth of parts required
in category k.
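
The construction of the labor input coefficients from the Table 2 parameters can be sketched as follows. Because the unit of measurement is one “standard” hour, a lot started in a given period absorbs the setup ratio plus the running-time share of the deliveries it covers. The function name is ours, and the two setup patterns listed for categories 4 and 5 (a single lot in period 2, or lots in periods 2 and 3) are our reading of the two alternative programs mentioned in the text.

```python
def unit_labor_inputs(alpha, fractions, setups):
    """Labor per 'standard' hour's worth of parts, by period.

    alpha: setup labor ratio; fractions: share of deliveries due each period;
    setups: 1-based periods in which a lot is started.
    """
    T = len(fractions)
    out = [0.0] * T
    for pos, t in enumerate(setups):
        nxt = setups[pos + 1] if pos + 1 < len(setups) else T + 1
        out[t - 1] = alpha + (1.0 - alpha) * sum(fractions[t - 1:nxt - 1])
    return out

# Parameters as listed in Table 2.
alpha = {1: 0.1, 2: 0.2, 3: 0.3, 4: 0.2, 5: 0.3}
early = [0.3, 0.3, 0.4]
late = [0.0, 0.4, 0.6]
fractions = {1: early, 2: early, 3: early, 4: late, 5: late}

# Setup patterns per category (those for categories 4 and 5 are assumed).
patterns = {k: [(1,), (1, 2), (1, 3), (1, 2, 3)] for k in (1, 2, 3)}
patterns.update({k: [(2,), (2, 3)] for k in (4, 5)})

columns = {(k, s): unit_labor_inputs(alpha[k], fractions[k], s)
           for k in patterns for s in patterns[k]}

T, K = 3, 5
n_unknowns = len(columns) + 3 * T   # x_kj plus l_t, s_t, v_t
n_equations = K + 2 * T             # (2.2), (2.3), (2.4)
print(n_unknowns, n_equations)
```

For category 1 the coefficients reproduce those of the three-period example up to the position of the decimal point.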

Altogether this system involves 25 unknowns and 11 equations. Of the
unknowns, 16 are of the $x_{kj}$ type, and there are three each of the $l_t$, $s_t$, and $v_t$
type. 12 Since the matrix shown in Table 3 indicates non-zero coefficients only,
the first row of that matrix (numbered 0) contains just three entries: the cost
10 If the reader insists upon some precision in the definition of a “handful,” and if he
is willing to recognize that the $a_i$, $b_i$, and $R_{it}$ parameters are each a bit fuzzy, I would
venture the guess that no real translation difficulties will occur if the number of distinct
parts within a given category k exceeds 10, and if the maximum time required for any
single part in a given category is less than 20% of the total. On this score, see Table 1.

11 Except for the position of the decimal point, the $\beta_{1jt}$ that appear in Table 3 are
identical with those calculated on p. 120 above. All other $\beta_{kjt}$ were obtained by a
similar process.

12 The reader may wonder why only two alternative programs (j = 1 and 2) are listed
for parts categories 4 and 5. In strict logic, even though no delivery requirements for these
parts exist during period 1, one should still consider the possibility of producing them
during that period as well as during 2 and 3. But period 1 production of these items would
only be profitable if, in an optimal solution, the “shadow price” associated with labor in
period 1 turned out to be lower than that associated with labor in period 2. Since the a
priori considerations were against this outcome, all activities corresponding to positive
amounts of period 1 output were omitted from the linear programming tableau shown in
Table 3. As things worked out, the optimal solution substantiated these conjectures, and
so nothing was lost by discarding the possibility of period 1 output for parts categories
4 and 5.

TABLE 2

Parameters and constants for the numerical example

                                      Parts category k
                                1        2        3        4        5

$\alpha_k$                     .1       .2       .3       .2       .3
$R_{k1} \div \sum_t R_{kt}$    .3       .3       .3        0        0
$R_{k2} \div \sum_t R_{kt}$    .3       .3       .3       .4       .4
$R_{k3} \div \sum_t R_{kt}$    .4       .4       .4       .6       .6
$Q_k$                       3,500    4,100    2,900    4,800    3,200

                                      Time period t
                                1        2        3

$S_t$                       6,000    6,000    6,000
$V_t$                       1,500    1,500    1,500


coefficients of the overtime labor variables in the minimand, expression (2.1).
The next five rows correspond to equations (2.2), the requirements for output
in each of the five parts categories. Then come the three rows numbered 6, 7,
and 8 (one for each time period) constraining the input of labor to lie within
the number of man-hours available. Rows 6, 7, and 8 correspond therefore to
equation group (2.3). And finally the last three rows (numbers 9, 10, and 11)
coincide with equations (2.4), the upper bound conditions upon the use of
overtime work in any one time period.

Using the [...] method of calculation, and taking advantage of the special
structure of the matrix shown in Table 3, it proved to be an easy matter to
calculate an optimal solution to this model, and to determine that the optimum
was unique. The optimal solution, along with the corresponding “shadow prices”
or “dual variables,” is shown in Table 4. According to this solution, it pays to
use a single-lot production plan for the output of every part in categories 3 and 5.
(Since these categories are the ones for which the setup cost parameter $\alpha_k$
is largest, this outcome is an entirely reasonable one.) All parts in category 2
are to be produced in two lots: 60% of the output in period 1, the remainder
in period 3. And in the case of both categories 1 and 4, it pays to combine two
lot-splitting plans. That is, 1,915 “standard” hours’ worth of parts in the first
category are to be produced by splitting production between time periods 1 and 3,
and the remaining parts in that category by splitting production among all three
time periods. Similarly, 1,479 hours’ worth of parts in category 4 are to be turned
out in a single lot during period 2, and the remainder of 3,321 is to be obtained
by splitting production between periods 2 and 3.
The dual variables for each equation ( Ul , Ui , • • • , Ull ) measure the potential 

a^XedtehXtT d / 0Ver A time lab K C ° St) ^ Unit Change in the constaQt 

associated with that equation. An extra hour’s worth of straight-time labor avail¬ 
able m period 2, for example, would make it possible to reduce the total amount 








TABLE 3

Matrix of detached coefficients. A numerical example of the linear programming model (2.1)-(2.5)
TABLE 4

Values of dual variables and of non-zero primal variables in the optimal solution

a) Non-zero primal variables

               Initial parameters         Values of x variables, in “standard” hours
Parts          and constants
category k     $a_k$    $Q_k$             One-lot plans   Two-lot plans   Three-lot plans
    1           .1      3,500 hours            -              1,915           1,585
    2           .2      4,100 hours            -              4,100             -
    3           .3      2,900 hours          2,900              -               -
    4           .2      4,800 hours          1,479            3,321             -
    5           .3      3,200 hours          3,200              -               -

               Man-hours
Time           $l_t$          $s_t$        $v_t$
period t       (overtime)     (slack)      (slack)
    1           1,500            -            -
    2             992            -           508
    3              -             -          1,500

Minimand = $\sum_t l_t$ = 2,492

b) Dual variables: change in minimand per unit change in value of the constant associated
with the particular equation.

Output requirement equations (2.2)

$u_1$ = 1.202 overtime hours/“standard” hour's worth of parts, category 1
$u_2$ = 1.299 overtime hours/“standard” hour's worth of parts, category 2
$u_3$ = 1.370 overtime hours/“standard” hour's worth of parts, category 3
$u_4$ = 1.000 overtime hours/“standard” hour's worth of parts, category 4
$u_5$ = 1.000 overtime hours/“standard” hour's worth of parts, category 5

Labor availability equations (2.3)

$u_6$ = -1.370 overtime hours/hour's worth of straight-time labor in time period 1
$u_7$ = -1.000 overtime hours/hour's worth of straight-time labor in time period 2
$u_8$ = -.706 overtime hours/hour's worth of straight-time labor in time period 3

Overtime limitation equations (2.4)

$u_9$ = -.370 overtime hours/hour's worth of overtime labor in time period 1
$u_{10}$ = 0 overtime hours/hour's worth of overtime labor in time period 2
$u_{11}$ = 0 overtime hours/hour's worth of overtime labor in time period 3

of overtime by exactly one hour. Hence $u_7 = -1$. But an extra hour available
in period 1 could be employed so as to avoid a substantial amount of lot-splitting,
and for this reason $u_6 = -1.370$. 13 Such values are immediately suggestive of

13 Although it will not necessarily always be true that $u_6 \le u_7 \le u_8 \le 0$, this ranking will
hold whenever: (a) inventory costs are negligible, and (b) the output sequences are defined
so that production of each item is permitted in any of the time periods prior to delivery.
“break-even” points for the worth of additional labor in the machine shop beyond
the amounts already assumed available. Similarly the dual variables $u_1, \ldots, u_5$,
those associated with the output requirement equations (1)-(5),
are indicative of the incremental worth of any external supply of parts in each
of these five categories.

One theorem about the properties of the first five dual variables will be asserted 
without proof: If all parts within two categories satisfy condition (3.1) but not 
(3.2), and if the first category's setup time parameter is lower than that of the 
second, then the “implicit cost” of meeting an additional hour's worth of output 
requirements in the first category will be no higher than that of an hour's worth 
in the second category. Hence $0 \le u_1 \le u_2 \le u_3$. Also $0 \le u_4 \le u_5$.

If one were concerned purely with the formal aspects of this economic lot size
problem, the discussion of the numerical example could end at this point. Given
the optimizing criterion and the constraints listed in (2.1)-(2.5), an optimal
solution has been produced for the aggregative scheduling problem, and in principle
it has been shown how this could be translated back into a detailed time-phased
plan for the output of each distinct part. But if one's interest is with the
actual managerial problem that is represented by this model, something more
needs to be said. In our idealization of the machine shop's activities, all interactions
have been neglected between the machine shop and the final assembly
area. In particular, we have ignored the possibility that by splitting parts production
within the machine shop, we may disrupt the smooth flow of final assembly
work on any one series of end items. To the extent that this intermittent
pattern of final assembly costs more than a continuous flow of work, the “sub-optimization”
calculated for the machine shop is a misleading one. This does
not mean that the linear programming analysis is useless, only that the results
of this analysis have to be integrated with what is also known about the final
assembly operation.

Here, for example, one of the men actually responsible for production planning
suggested that it might be possible to transfer skilled final assembly machinists
from their usual jobs, and to bring them temporarily into the machine
shop to help meet the initial period's peak demand there. Since this proposal
would make it possible to avoid all lot-splitting, it contains several attractive
features: not only the obvious reduction in setup costs, 14 but also the very real
benefits to be derived in the final assembly area by having 100% of every part
available at the time that final assembly is initiated for any one series of end
items. Against both of these prospective benefits, it is, of course, also necessary
to evaluate the immediate cost of disrupting final assembly activities by such a
temporary transfer. The linear programming analysis of the machine shop cannot
by itself indicate that such transfers would be in the best interests of the
plant as a whole, but it can at least indicate the order of magnitude of the direct
[...] in single-lot production of all parts. Surely this calculation
is not the only thing relevant to the question of whether workers ought to be
transferred temporarily, but it does represent one of the pieces of information
needed in order to arrive at a sound decision.

14 If manpower were available early enough to make single-lot production possible for
all parts, the actual time of 20,492 man-hours ($\sum_t S_t + \sum_t l_t$) could be reduced to the
“standard” time of 18,500 hours ($Q_1 + Q_2 + Q_3 + Q_4 + Q_5$). The excess labor requirement for
split-lot production amounts therefore to 1,992 man-hours.
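
The labor totals quoted in footnote 14 can be re-derived from the figures in Tables 2 and 4. The short Python sketch below is only an arithmetic check; the variable names are illustrative and do not appear in the paper.

```python
# Figures copied from Tables 2 and 4 of the numerical example.
straight_time = [6000, 6000, 6000]               # S_t: straight-time hours (slack s_t is zero in every period)
overtime_used = [1500, 992, 0]                   # l_t: overtime hours in the optimal solution
standard_hours = [3500, 4100, 2900, 4800, 3200]  # Q_k: "standard" hours per parts category

actual_time = sum(straight_time) + sum(overtime_used)
standard_time = sum(standard_hours)
excess_for_lot_splitting = actual_time - standard_time

print(actual_time, standard_time, excess_for_lot_splitting)
```

The difference between the two totals is the labor absorbed by the extra setups that lot-splitting requires.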

5. A theorem on the occurrence of fractional values for the $x_{ij}$ variables

Up to this point, it was convenient to assert without proof that the applicability
of the linear programming proposal did not depend upon the possibility
of aggregating distinct parts into the output categories defined by (3.1) and (3.2),
but only upon the existence of a large number of distinct parts $i$, each of them
with a small labor input requirement by comparison with the total availability of
labor. A more precise form of this assertion is as follows: Consider the model described
by (2.1)-(2.5). Then if there are $I$ parts and $T$ time periods, in any
basic feasible solution there will be at least $(I - T)$ parts for which exactly one
$x_{ij}$ variable is operated at a positive level. Thus, except for at most $T$ parts, the
linear programming solution will immediately indicate a detailed feasible time-phased
plan for the output of each item in the machine shop. For each of these
$T$ parts there is indeed the possibility that the linear programming solution will
require half the lot to be produced according to a one-lot plan and half according
to a two-lot plan. The physical absurdity of such a solution is obvious, but if $T$
is sufficiently small in relation to $I$, the few parts that will be affected should
[...] from the viewpoint of long-range planning. Another way to
state this result is to say that when the number of parts to be scheduled far exceeds
the number of individual time periods (a reasonable enough assumption
when one end item alone may contain 110 distinct components), the very multiplicity
of parts acts in such a way as to smooth out the “lumpiness” associated
with setup costs.

The preceding theorem may be restated as follows:

If in any basic feasible linear programming solution to the model indicated
by (2.1)-(2.5) there are $m$ parts for which exactly one $x_{ij}$ variable appears at a
positive intensity in that solution, then $m \ge I - T$.

Let $n$ represent the number of the variables $l_t$ and $v_t$ operated at positive
intensities in the particular basic feasible solution. (In order for equations (2.4) to be
satisfied, $n \ge T$.) The expression $(I - m)$ represents the number of parts for
which two or more $x_{ij}$ variables are operated at a positive intensity in the particular
solution. Now since there are altogether $(2T + I)$ restraint equations,
at most $(2T + I)$ variables appear at a positive level in the solution:

(5.2) $2T + I \ge n + m + 2(I - m)$

and since $n \ge T$,

(5.3) $T + I \ge m + 2(I - m)$

(5.4) $\therefore m \ge I - T$,

which was to be proved.
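
The counting argument can be checked against the optimal solution of Table 4. The sketch below makes one assumption that is not in the paper: each of the five parts categories is treated as a single "part", so that $I = 5$ and $T = 3$. The counts of positive variables are read off Table 4.

```python
T = 3  # time periods
I = 5  # parts categories, each treated here as one "part" (an assumption for illustration)

# Number of positive x variables per category in Table 4.
positive_x = {1: 2, 2: 1, 3: 1, 4: 2, 5: 1}
# Overtime (l_t) and overtime-slack (v_t) values from Table 4.
l = [1500, 992, 0]
v = [0, 508, 1500]

m = sum(1 for count in positive_x.values() if count == 1)  # parts with exactly one positive x
n = sum(1 for value in l + v if value > 0)                 # positive l_t and v_t variables

print("m =", m, " n =", n)
print("(5.2):", 2 * T + I, ">=", n + m + 2 * (I - m))
print("(5.4):", m, ">=", I - T)
```

Under this reading, inequality (5.2) holds with equality in the example, and the bound (5.4) is satisfied with one category to spare.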


V-17—PROGRAMMING OF ECONOMIC LOT SIZES 




6. Summary 

This paper may be recapitulated as follows: Starting with a production
scheduling problem that involves indivisibilities in the form of setup costs, a
linear programming model has been constructed that is not identical with the
original problem, but which provides an excellent approximation when the number
of distinct parts is large in comparison with the number of time periods, $T$.
In this approximation to the original problem, the variables do not refer to the
size of each production lot within each time period, but rather to the fraction
of the total requirement for any given part that is satisfied by a particular sequence
of production for that part. The linear programming formulation ensures
that, except for at most $T$ individual parts, these fractions will all turn
out to be either zero or one. With this exception, therefore, any “basic feasible”
solution will automatically avoid the possibility of meeting one portion of the
requirement for a given part by a one-lot program of output and another portion
of the requirements with a split-lot program. Although this physically
absurd option is built into the model, a theorem ensures that the option will be
exercised only rarely.

How serious a distortion of reality is implied by a linear programming solution
that calls for the production of a few parts in this physically absurd manner?
From a purely abstract standpoint, such a solution is completely infeasible,
and it is easy to construct numerical examples for which the linear programming
solution could not be “patched up” without a large increase in the total
system costs. Despite this perfectly valid formal objection, it may seriously
be doubted that this difficulty really detracts from the usefulness of the model.
The detailed optimal solution to such a model is hardly intended as a literal
forecast of production activities up to eighteen months in the future, but only
as a guide to making a number of immediate decisions that will affect the future:
overtime, recruiting and training of new personnel, and outside procurement
of certain parts. For the purpose of choosing among these broad alternatives
(although not for the detailed short-run scheduling problem) the few
apparent infeasibilities should be of minor significance.

This same line of reasoning should do much to dispel another kind of objection
that may be raised against the model presented here. The usefulness of
this proposal depends upon the magnitude of the number of distinct
time periods, $T$. Since the number of alternate production activities to be enumerated
for a single part category is of the order of $2^T$, the capacity of current
computing equipment would not be taxed by a model with $T \le 8$, but would
clearly be swamped for $T \ge 15$. 15 Certainly there is no guarantee that it will
always be satisfactory to plan production over an 18-month period in time
units as large as one to three months. Indeed a determined critic would be
within his rights in pointing out that it might be necessary to plan a single
year's operation ahead in terms of 365 individual time units, each one day in
length. The answer to such a critic can only come from a study of the empirical
problem to which the model is to be applied. Assuming that the purpose of the
model is to aid in answering certain broad questions dealing with overtime,
outside procurement, etc., it should not be a serious limitation upon the problem
formulator for him to keep the value of $T$ well within the limits of present-day
computing feasibility.

15 Even for large values of $T$, it would still be possible to enumerate just a small number
of the more interesting alternative production programs for any one part. A solution
based upon such an incomplete enumeration would still be a feasible one, and should be
near-optimal, even though no a priori guarantees can be made as to its optimality.


7. Significance of the results 

The production scheduling example discussed in this paper is by no means an
isolated instance in which, starting with a problem that entailed indivisibilities
in terms of one set of variables, it was nevertheless possible to redefine the
variables so as to transform the original problem into a new one that could be
studied from the computational viewpoint of linear programming. This same
approach has already been illustrated in the newsprint trim problem [5], in the
coat-and-pants problem [3], in Salveson's machine loading problem [6, pp. 234-245],
and doubtless in others. There appears to be an entire class of optimization
problems that involve indivisibilities in terms of one set of variables, but
which can nevertheless be translated into the linear programming format. Some
precise characterization of this class of problems seems to be needed but is
lacking at present.

Although the economist's primary interest is not in numerical analysis, but
rather in the possibility of market analogue solutions to welfare maximization
problems, the indivisibility of setup costs places him in an awkward position.
As long as he regards the individual “activity” as one of determining the lot
size for a given part in a particular time period, there need be no set of intra-firm
shadow prices that is compatible with a cost-minimizing equilibrium and
hence no possibility of a market analogue solution. The curious aspect of the
production problem outlined here is that it is possible to redefine activities and
commodities so as to end up with a linear programming system, i.e., one for
which, in principle, a market analogue solution is possible. From the viewpoint
of the theory of market decentralization, the chief feature of this alternative
version is that the individual activities represent a greater degree of vertical
integration than is assumed in the initial statement of the problem. Paradoxically
enough, successful decentralization requires that the manager of each
activity have a longer “span of control” than the size of the individual lot in a
particular time period. It is necessary for each such manager to be familiar with
the entire program of labor inputs that is implied by his particular sequence of
output for the individual part.




“Dominance” properties of the set of alternative production programs 
for a given item 

A shop is engaged in producing a number of items with a resource
input that is homogeneous except for date. If an item is produced in the $t$-th
time period ($t = 1, 2, \ldots, T$), resource inputs are required from the total available
in that period, but from no other. The amount of resources used in the $t$-th
period by producing $x_t$ units of a particular item is given by:

(1) $a\delta_t + b x_t$

where $x_t > 0$ implies $\delta_t = 1$, and $x_t = 0$ implies $\delta_t = 0$.

The non-negative constant $a$ is said to represent the “setup cost” for that item
and the non-negative constant $b$ the “incremental unit cost.” Since $\delta_t = 0, 1$,
there are altogether $2^T$ column vectors of the following form:

(2) $\Delta_j = (\delta_{j1}, \delta_{j2}, \ldots, \delta_{jT})'$

Now suppose that the firm is to deliver $R_t$ units of the item in the $t$-th period.
Corresponding to each of the $2^T$ vectors $\Delta_j$, the time phased production vector
$X_j$ may be written, where:

(3) $X_j = (x_{j1}, x_{j2}, \ldots, x_{jT})'$

and where the output levels $x_{jt}$ are determined according to either (4), (5), or
(6). These conditions are equivalent to the rule that each delivery requirement
be satisfied out of production during the nearest preceding period in which setup
costs are being incurred:

(4) if $\delta_{jt} = 0$, then $x_{jt} = 0$

(5) if $\delta_{jt} = \delta_{j,t+1} = 1$, then $x_{jt} = R_t$

(6) if $\delta_{jt} = 1$, $\delta_{j,t+1} = 0$, and if $\gamma$ is the largest integer such that
$\delta_{j,t+\tau} = 0$ for $\tau = 1, \ldots, \gamma$, then $x_{jt} = \sum_{\tau=0}^{\gamma} R_{t+\tau}$.

The setup plan $\Delta_j$ and the corresponding output plan $X_j$ are said to be “feasible”
from the viewpoint of delivery requirements if the components of $X_j$ also
satisfy:

(7a) $\sum_{\tau=1}^{t} x_{j\tau} \ge \sum_{\tau=1}^{t} R_\tau$   ($t = 1, 2, \ldots, T - 1$)

and

(7b) $\sum_{\tau=1}^{T} x_{j\tau} = \sum_{\tau=1}^{T} R_\tau$

For each of the “feasible” $\Delta_j$ and $X_j$ vectors, the resource input column vector
$\beta_j$ may be defined as follows:

(8) $\beta_j = a\Delta_j + bX_j$   ($j = 1, \ldots, J$)

The $T \times J$ matrix $B$ is composed of the vectors $\beta_j$:

(9) $B = (\beta_1, \beta_2, \ldots, \beta_j, \ldots, \beta_J)$
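
As an illustration of rules (4)-(9), the following Python sketch enumerates all $2^T$ setup patterns for one item, builds the output plan that serves each requirement from the nearest preceding setup period, and keeps the patterns that pass the feasibility test (7a)-(7b). The function names and the sample data in the usage note are hypothetical.

```python
from itertools import product


def production_plan(delta, R):
    """Output levels per rules (4)-(6): each requirement R[t] is produced in
    the nearest period s <= t that has a setup (delta[s] == 1)."""
    x = [0] * len(R)
    last_setup = None
    for t, requirement in enumerate(R):
        if delta[t] == 1:
            last_setup = t
        if last_setup is not None:
            x[last_setup] += requirement
    return x


def is_feasible(x, R):
    """Conditions (7a) and (7b): cumulative output never falls short of
    cumulative requirements, and the totals agree at the horizon."""
    cum_x = cum_r = 0
    for xt, rt in zip(x, R):
        cum_x += xt
        cum_r += rt
        if cum_x < cum_r:
            return False
    return cum_x == cum_r


def resource_vectors(a, b, R):
    """Triples (Delta_j, X_j, beta_j) with beta_j = a*Delta_j + b*X_j, per (8)."""
    columns = []
    for delta in product((0, 1), repeat=len(R)):
        x = production_plan(delta, R)
        if is_feasible(x, R):
            beta = [a * d + b * xt for d, xt in zip(delta, x)]
            columns.append((delta, x, beta))
    return columns


for delta, x, beta in resource_vectors(a=2, b=1, R=[3, 3, 4]):
    print(delta, x, beta)
```

With a positive first-period requirement, only the patterns with a setup in period 1 survive, so the example prints four columns of $B$ rather than eight.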

Now let the “implicit value” or “shadow price” of any resources used in the
$t$-th time period be represented by $u_t$ ($u_t \ge 0$; [...]). The column vector
formed from these components is termed $U$:

(10) $U = (u_1, u_2, \ldots, u_t, \ldots, u_T)'$

Consider any setup plan $\Delta_r$ and output plan $X_r$ satisfying conditions (11)-(13):

(11) $x_{rt} > 0$ if and only if $\delta_{rt} = 1$; $x_{rt} = 0$ if and only if $\delta_{rt} = 0$

(12a) $\sum_{\tau=1}^{t} x_{r\tau} \ge \sum_{\tau=1}^{t} R_\tau$   ($t = 1, 2, \ldots, T - 1$)

(12b) $\sum_{\tau=1}^{T} x_{r\tau} = \sum_{\tau=1}^{T} R_\tau$

and

(13) $x_{rt} \ge 0$   (all $t$)

Denote by $\beta_r$ the vector of resource inputs that is required in order to carry
out this production plan:

(14) $\beta_r = a\Delta_r + bX_r$

“Dominance” theorem: If the vector $U \ge 0$, there is no pair of vectors $\Delta_r$
and $X_r$ satisfying conditions (11)-(13) for which it is also true that:

(15) $U'\beta_r < U'\beta_j$   (all $\beta_j \in B$)

In words, this theorem says that if the production program $X_r$ is feasible from
the viewpoint of delivery requirements, then there will always be at least one
program $X_j$ within the previously enumerated set that has an implicit cost at
least as low as that for program $X_r$. This is the sense in which the set of resource
input vectors $B$ is said to “dominate” all others.
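
The dominance property can be probed numerically on small cases. The sketch below is a brute-force check, not a proof: it compares the cheapest enumerated program with the cheapest of all integer-valued programs that meet the delivery requirements. Restricting the search to integer plans, and all function names, are assumptions made for illustration.

```python
from itertools import product


def enumerated_costs(a, b, R, U):
    """Implicit costs U'beta_j of the plans built from every setup pattern by
    serving each requirement from the nearest preceding setup period."""
    costs = []
    for delta in product((0, 1), repeat=len(R)):
        x, last = [0] * len(R), None
        for t, d in enumerate(delta):
            if d:
                last = t
            if last is not None:
                x[last] += R[t]
        if sum(x) == sum(R):  # no positive requirement precedes the first setup
            costs.append(sum(u * (a * d + b * xt) for u, d, xt in zip(U, delta, x)))
    return costs


def arbitrary_plan_costs(a, b, R, U):
    """Implicit costs of all integer plans X_r meeting the cumulative delivery
    requirements, with a setup charged exactly when x_t > 0."""
    total = sum(R)
    costs = []
    for x in product(range(total + 1), repeat=len(R)):
        if sum(x) != total:
            continue
        cum_x = cum_r = 0
        feasible = True
        for xt, rt in zip(x, R):
            cum_x += xt
            cum_r += rt
            if cum_x < cum_r:
                feasible = False
                break
        if feasible:
            costs.append(sum(u * (a * (xt > 0) + b * xt) for u, xt in zip(U, x)))
    return costs


def dominance_holds(a, b, R, U):
    return min(enumerated_costs(a, b, R, U)) <= min(arbitrary_plan_costs(a, b, R, U))


for U in ([1, 1, 1], [1, 2, 3], [3, 2, 1], [0, 5, 1], [2, 0, 4]):
    print(U, dominance_holds(a=3, b=1, R=[2, 1, 3], U=U))
```

Each trial compares minima only, which is equivalent to saying that no feasible integer plan is strictly cheaper than every enumerated one.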
ON A DYNAMIC PROGRAMMING APPROACH TO
THE CATERER PROBLEM—I* 

RICHARD BELLMAN 
The RAND Corporation 
Summary

In this paper, it is shown that the “caterer” problem, a problem in mathematical
economics and logistics which has been discussed by Jacobs, Gaddum,
Hoffman and Sokolowsky, and Prager, can be reduced to the problem of determining
the maximum of the linear form $L_n = \sum_{i=1}^{n} v_i$, subject to a series of
constraints of the form $v_1 \le b_1$, $v_1 + v_2 \le b_2$, $v_1 + v_2 + v_3 \le b_3$, ...,
$v_1 + v_2 + \cdots + v_k \le b_k$, $v_2 + v_3 + \cdots + v_{k+1} \le b_{k+1}$, ..., $v_{n-k+1} + \cdots + v_n \le b_n$,
$0 \le v_i \le r_i$, $i = 1, 2, \ldots, n$, under an assumption concerning the non-accumulation
of dirty laundry.

This maximization problem is solved explicitly, using the functional equation
technique of dynamic programming.

1. Introduction

The purpose of this paper is to show how the functional equation method of
dynamic programming may be used to obtain an explicit solution of the problem
of determining the maximum of the linear form

(1) $L_n(v) = v_1 + v_2 + \cdots + v_n$,

over all $v_i$ subject to the constraints

(2) (a) $r_i \ge v_i \ge 0$

    (b) $v_1 \le b_1$
        $v_1 + v_2 \le b_2$
        ...
        $v_1 + v_2 + \cdots + v_k \le b_k$
        $v_2 + v_3 + \cdots + v_{k+1} \le b_{k+1}$
        ...
        $v_{n-k+1} + \cdots + v_n \le b_n$.

The origin of this problem lies in the “caterer” problem, a problem of some 
interest in recent years in connection with economic, industrial and military 
scheduling problems. 

2. Discussion 

A large number of mathematical models of economic activities culminate in
the problem of determining the maximum or minimum of a linear function subject
to a set of linear constraints. The importance of having available computational
algorithms for the numerical resolution of these problems can hardly be
over-estimated, both as far as application of the results is concerned, and as
far as further theoretical study is concerned. Foremost of these algorithms is
the “simplex” method of Dantzig, together with its modifications by Charnes,
Lemke, Beale, and others.

* Received October, 1956.

In the study of universal methods, insufficient attention has been paid to the
underlying structure of the processes generating the minimization and maximization
problems. Ideally what is desired is a systematic fitting to each process of
a computational algorithm specifically designed for the process. There has been
barely a start made in the mathematical theory of computational algorithms;
cf. the discussion in [1]. In particular, little effort has been devoted to the question
of analytic solution of these minimization and maximization problems.

In this paper we wish to consider the interesting minimization problem posed
above using the functional equation approach of dynamic programming, [2].
The problem from which it is derived, the “caterer” problem, has been discussed
by a number of mathematicians over the last few years, see Jacobs, [3], Gaddum,
Hoffman and Sokolowsky, [4], and Prager, [5].

Our interest in the possibility of an explicit solution of the type we present 
here was aroused by the solution obtained by O. Gross in the case where k = 2. 

3. The Caterer Problem 

Let us now state the caterer problem in the following form: (cf. Jacobs, [3],
Prager, [5])

“A caterer knows that in connection with the meals he has arranged to serve
during the next $n$ days, he will need $r_j$ fresh napkins on the $j$th day, $j = 1, 2, \ldots, n$.
There are two types of laundry service available. One type requires $p$
days and costs $b$ cents per napkin; a faster service requires $q$ days, $q < p$, but
costs $c$ cents per napkin, $c > b$. Beginning with no usable napkins on hand or
in the laundry, the caterer meets the demands by purchasing napkins at $a$ cents
per napkin. How does the caterer purchase and launder napkins so as to minimize
the total cost for $n$ days?”

As is known from the above references, and also J. W. Gaddum, A. J. Hoffman
and D. Sokolowsky, [4], this problem can be resolved by linear programming
techniques in some cases.

In this paper we shall approach the problem using the approach of dynamic
programming.


4. Dynamic Programming Approach—I 

The first approach to the problem by means of dynamic programming proceeds
as follows. The state of the process at any time may be specified by the
stage, i.e. day, and by the number of napkins due back from the laundry in 1,
2, up to $p$ days hence. On the basis of that information, we must make a decision
as to how many napkins to purchase, and how to launder the accumulated
dirty napkins.

It is not difficult to formulate the problem in this way, using the functional
equation approach. Unfortunately, if $p$ is large, we founder on the shoals of
dimensionality.

As we shall see, the proper dimensionality of the problem is $p - q$, when
formulated in a different manner.

5. Formulation of Problem

In place of this approach, let us proceed with the equations defining the
process in the usual way until an appropriate point at which we shall reintroduce
the dynamic programming approach.

It is first of all clear from the above formulation of the problem that we may
just as well purchase all the napkins at one time at the start of the process. Let
us then begin by solving the simpler problem of determining the laundering
process to employ given an initial stock of $S$ napkins. Clearly

(1) $S \ge \max_k r_k$.

Let us now make a simplifying assumption that all the dirty napkins returned
at the end of each day are sent out to the laundry, either to the fast service or
to the slow service. There are many justifications for this assumption as far as
applications are concerned, which we shall not enter into at the moment.

The process then continues as follows. At the end of the $k$-th day, the caterer
divides $r_k$, the quantity of dirty napkins on hand, into two parts, $r_k = u_k + v_k$,
with $u_k$ sent to the $q$-day laundry and $v_k$ sent to the $p$-day laundry.

We see that the quantity, $x_k$, of clean napkins available
at the beginning of the $k$-th day is determined by the following recurrence
relation:

(2) $x_1 = S$,
    $x_k = (x_{k-1} - r_{k-1}) + u_{k-q} + v_{k-p}$,

where $u_k = v_k = 0$ for $k \le 0$.

The cost incurred on the $k$-th day is

(3) $b v_k + c u_k$,   $k = 1, 2, \ldots, N - 1$.

Hence the total cost is

(4) $C_N = b \sum_{k=1}^{N-1} v_k + c \sum_{k=1}^{N-1} u_k$.

The problem is to minimize $C_N$ subject to the constraints on the $u_k$

(5) $0 \le u_k \le r_k$,   $k = 1, 2, \ldots, N$.

In order to illustrate the method, we shall consider two particular cases.

a. $q = 1$, $p = 2$

b. $q = 1$, $p = 3$

The general case will be discussed following this.

6. The case q = 1, p = 2

The equations in (5.2) assume the form

(1) $x_1 = S$
    $x_2 = (x_1 - r_1) + u_1$
    $x_3 = (x_2 - r_2) + u_2 + v_1$
    ...
    $x_{n-1} = (x_{n-2} - r_{n-2}) + u_{n-2} + v_{n-3}$,
    $x_n = (x_{n-1} - r_{n-1}) + u_{n-1} + v_{n-2}$.

Let us now solve for the $x_k$ in terms of the $u_k$ and $v_k$. Namely

(2) $x_1 = S$
    $x_2 = (S - r_1) + u_1$
    $x_3 = (S - r_1 - r_2) + (u_1 + u_2) + v_1$
    $x_4 = (S - r_1 - r_2 - r_3) + (u_1 + u_2 + u_3) + v_1 + v_2$,
    ...
    $x_{n-1} = (S - r_1 - r_2 - \cdots - r_{n-2}) + (u_1 + u_2 + u_3 + \cdots + u_{n-2}) + (v_1 + v_2 + \cdots + v_{n-3})$
    $x_n = (S - r_1 - r_2 - \cdots - r_{n-1}) + (u_1 + u_2 + u_3 + \cdots + u_{n-1}) + (v_1 + v_2 + \cdots + v_{n-2})$.

Since $r_k = u_k + v_k$, this may be written

(3) $x_k = S - v_{k-1}$,   ($v_0 = 0$),   $k = 1, 2, \ldots, n$.

Turning to (5.4), we wish to minimize

(4) $C_n = c \sum_{k=1}^{n-1} r_k + (b - c) \sum_{k=1}^{n-1} v_k$,

over all $v_k$ subject to the constraints

(5) (a) $0 \le v_k \le r_k$

    (b) $S - v_{k-1} \ge r_k$ or $S - r_k \ge v_{k-1}$.

Since $(c - b) > 0$, we wish to choose $v_k$ as large as possible. Hence

(6) $v_k = \min (r_k, S - r_{k+1})$,   $k = 1, 2, \ldots, N - 1$.

This determines the structure of the optimal policy. Using this explicit form
of the solution it is not difficult to determine the minimizing value of $S$.
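
The policy (6) is easy to try out numerically. The following Python sketch, with hypothetical function names and sample data, applies (6) for $q = 1$, $p = 2$, replays the recurrence (5.2) to confirm that every day's requirement is met, and compares the resulting cost with an exhaustive search over integer choices of the $v_k$. It assumes, as in (5.1), that $S \ge \max_k r_k$.

```python
from itertools import product


def optimal_slow_loads(r, S):
    """Policy (6): v_k = min(r_k, S - r_{k+1}) for k = 1..n-1 (lists are 0-based)."""
    return [min(r[k], S - r[k + 1]) for k in range(len(r) - 1)]


def clean_stock(r, S, v):
    """Recurrence (5.2) with q = 1, p = 2. Returns [x_1, ..., x_n]."""
    n = len(r)
    u = [r[k] - v[k] for k in range(n - 1)]      # fast-service loads, u_k = r_k - v_k
    x = [S]
    for k in range(1, n):
        fast_back = u[k - 1]                     # sent the previous evening
        slow_back = v[k - 2] if k >= 2 else 0    # sent two evenings ago
        x.append(x[k - 1] - r[k - 1] + fast_back + slow_back)
    return x


def laundering_cost(r, v, b, c):
    """Total cost (5.4): b per slow napkin, c per fast napkin, over days 1..n-1."""
    return sum(b * vk + c * (rk - vk) for rk, vk in zip(r, v))


def brute_force_min_cost(r, S, b, c):
    """Cheapest feasible integer choice of v_1..v_{n-1}, found by enumeration."""
    best = None
    for v in product(*(range(rk + 1) for rk in r[:-1])):
        x = clean_stock(r, S, list(v))
        if all(xk >= rk for xk, rk in zip(x, r)):
            value = laundering_cost(r, v, b, c)
            best = value if best is None else min(best, value)
    return best


r, S, b, c = [3, 5, 2, 4], 6, 1, 3
v = optimal_slow_loads(r, S)
print(v, clean_stock(r, S, v), laundering_cost(r, v, b, c), brute_force_min_cost(r, S, b, c))
```

In this example the stock available each morning equals $S - v_{k-1}$, as relation (3) predicts, and the cost of the closed-form policy agrees with the exhaustive minimum.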
7. The case q = K, p = K + 1

It is readily seen upon writing down the equations that the case q = K, 
p = K + 1 leads to a system of equations of the same type as given above for 
q = 1, p = 2. This illustrates the fact that it is only the difference p — q which 
determines the level of difficulty of the problem. 

8. The Case q = 1, p = 3 

In order to illustrate the method which is applicable to the general case, let
us consider the case $q = 1$, $p = 3$.

The equations in (5.2) assume the form

(1) $x_1 = S$,
    $x_2 = x_1 - r_1 + u_1$,
    $x_3 = x_2 - r_2 + u_2$,
    $x_4 = x_3 - r_3 + u_3 + v_1$
    ...
    $x_n = x_{n-1} - r_{n-1} + u_{n-1} + v_{n-3}$

Thus

(2) $x_1 = S$,
    $x_2 = S - r_1 + u_1$,
    $x_3 = (S - r_1 - r_2) + u_1 + u_2$,
    $x_4 = (S - r_1 - r_2 - r_3) + u_1 + u_2 + u_3 + v_1$
    ...
    $x_n = (S - r_1 - r_2 - r_3 - \cdots - r_{n-1}) + u_1 + u_2 + u_3 + \cdots + u_{n-1} + v_1 + v_2 + \cdots + v_{n-3}$.

Hence

(3) $x_1 = S$
    $x_2 = S - v_1$
    $x_3 = S - v_1 - v_2$
    $x_4 = S - v_2 - v_3$
    ...
    $x_n = S - v_{n-2} - v_{n-1}$.

We wish to maximize $\sum_{k=1}^{n-1} v_k$ subject to the constraints

(4) $S - v_1 \ge r_1$ or $S - r_1 \ge v_1$,
    $S - v_1 - v_2 \ge r_2$ or $S - r_2 \ge v_1 + v_2$
    ...
    $S - v_{n-2} - v_{n-1} \ge r_{n-1}$ or $S - r_{n-1} \ge v_{n-2} + v_{n-1}$

and

(5) $0 \le v_i \le r_i$

9 . Dynamic Programming Formulation—II 
Our problem reduces to that of maximizing the linear form

(1) $L_N = \sum_{k=1}^{N} v_k$,

subject to a set of constraints of the form

(2) (a) $b_1 \ge v_1$,
        $b_2 \ge v_1 + v_2$,
        ...
        $b_N \ge v_N + v_{N-1}$

    (b) $r_k \ge v_k \ge 0$.

Having chosen $v_1$, it is clear that we have a problem of precisely the same type
remaining for the other variables $v_2, v_3, \ldots, v_N$. Let us then define the sequence
of functions $\{f_k(x)\}$, $k = 1, 2, \ldots, N - 1$, as follows:

(3) $f_k(x) = \max_{R_k} \sum_{i=k}^{N} v_i$,

where $R_k$ is the region defined by

(4) (a) $x \ge v_k \ge 0$,
        $b_{k+1} \ge v_k + v_{k+1}$,
        ...
        $b_N \ge v_{N-1} + v_N$,

    (b) $r_{k+1} \ge v_{k+1} \ge 0$,
        ...
        $r_N \ge v_N \ge 0$.

We have

(5) $f_{N-1}(x) = \mathrm{Max}\,[v_{N-1} + v_N]$

where

(6) $x \ge v_{N-1} \ge 0$,
    $b_N \ge v_{N-1} + v_N$,   $r_N \ge v_N \ge 0$.

Hence

(7) $f_{N-1}(x) = \mathrm{Min}\,[b_N, x + r_N]$.

Employing the principle of optimality, [2], we see that

(8) $f_k(x) = \mathrm{Max}_{0 \le v_k \le v_k^*} [v_k + f_{k+1}(\mathrm{Min}(r_{k+1}, b_{k+1} - v_k))]$

where

$v_k^* = \mathrm{Min}\,[x, b_{k+1}]$, for $k = 1, 2, \ldots, N - 1$.
10. Explicit Solution

Let us assume that each $f_k(x)$ has the form

(1) $f_k(x) = \mathrm{Min}\,[P_k, x + Q_k]$,

for $k = 1, 2, \ldots, N - 1$. This is true for $k = N - 1$ upon referring to (9.7),
and we shall establish it inductively for general $k$.

Assuming the relation true for $k + 1$, substitute in (9.8), obtaining

(2) $f_k(x) = \mathrm{Max}_{0 \le v_k \le v_k^*} \{v_k + \mathrm{Min}\,[P_{k+1}, \mathrm{Min}(r_{k+1}, b_{k+1} - v_k) + Q_{k+1}]\}$
    $= \mathrm{Max}_{0 \le v_k \le v_k^*} \{\mathrm{Min}\,[P_{k+1} + v_k, \mathrm{Min}(r_{k+1}, b_{k+1} - v_k) + Q_{k+1} + v_k]\}$
    $= \mathrm{Max}_{0 \le v_k \le v_k^*} \{\mathrm{Min}\,[P_{k+1} + v_k, \mathrm{Min}(r_{k+1} + Q_{k+1} + v_k, b_{k+1} + Q_{k+1})]\}$
    $= \mathrm{Min}\,[P_{k+1} + v_k^*, r_{k+1} + Q_{k+1} + v_k^*, b_{k+1} + Q_{k+1}]$
    $= \mathrm{Min}\,[P_{k+1} + \mathrm{Min}(x, b_{k+1}), r_{k+1} + Q_{k+1} + \mathrm{Min}(x, b_{k+1}), b_{k+1} + Q_{k+1}]$
    $= \mathrm{Min}\,[x + P_{k+1}, P_{k+1} + b_{k+1}, x + r_{k+1} + Q_{k+1}, b_{k+1} + Q_{k+1}]$
    $= \mathrm{Min}\,[x + \mathrm{Min}(P_{k+1}, r_{k+1} + Q_{k+1}), \mathrm{Min}(P_{k+1} + b_{k+1}, Q_{k+1} + b_{k+1})]$.

Hence we have the recurrence relation

(3) $P_k = \mathrm{Min}\,(P_{k+1} + b_{k+1}, Q_{k+1} + b_{k+1})$,   $P_{N-1} = b_N$,
    $Q_k = \mathrm{Min}\,(P_{k+1}, r_{k+1} + Q_{k+1})$,   $Q_{N-1} = r_N$,

for $k = 1, 2, \ldots, N - 1$.

These recurrence relations determine $f_k(x)$. Furthermore, the optimal policy
is determined by the relation

(4) $v_k = \mathrm{Min}\,[x, b_{k+1}]$,

at each stage.
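
The recurrence (3) can be compared with a direct evaluation of definition (9.3). In the sketch below the coefficients $P_k$, $Q_k$ are computed backward from $k = N - 1$, and $\mathrm{Min}[P_k, x + Q_k]$ is checked against a brute-force maximum over the integer points of the region (9.4). The data, the 1-based list layout with a dummy entry at index 0, and the function names are assumptions made for illustration.

```python
from itertools import product


def pq_coefficients(r, b):
    """Backward recurrence (10.3). r[1..N] and b[2..N] are 1-based (index 0 unused).
    Returns dicts P, Q such that f_k(x) = min(P[k], x + Q[k]) for k = 1..N-1."""
    N = len(r) - 1
    P = {N - 1: b[N]}
    Q = {N - 1: r[N]}
    for k in range(N - 2, 0, -1):
        P[k] = min(P[k + 1] + b[k + 1], Q[k + 1] + b[k + 1])
        Q[k] = min(P[k + 1], r[k + 1] + Q[k + 1])
    return P, Q


def f_closed(k, x, P, Q):
    """Closed form (10.1)."""
    return min(P[k], x + Q[k])


def f_brute(k, x, r, b):
    """Maximum of v_k + ... + v_N over the integer points of region (9.4)."""
    N = len(r) - 1
    ranges = [range(x + 1)] + [range(r[i] + 1) for i in range(k + 1, N + 1)]
    best = 0
    for v in product(*ranges):  # v[j] stands for v_{k+j}
        if all(v[j] + v[j + 1] <= b[k + j + 1] for j in range(len(v) - 1)):
            best = max(best, sum(v))
    return best


r = [None, 4, 3, 5, 2]     # r_1..r_4
b = [None, None, 5, 6, 4]  # b_2..b_4
P, Q = pq_coefficients(r, b)
for k in (1, 2, 3):
    print(k, P[k], Q[k], [f_closed(k, x, P, Q) for x in range(6)])
```

Restricting the brute-force search to integers is adequate here because the data are integers and each constraint involves a run of consecutive variables, so the linear program has an integer optimum.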


11. Explicit Solution for the General System 

Let us now show that the same method may be used to solve the general
maximization problem stated in §1.

Define

(1)  $f_k(x_1, x_2, \ldots, x_{K-1}) = \max_{R_k}\,[v_k + v_{k+1} + \cdots + v_n]$,

where $R_k$ is defined, for $k = 1, 2, \ldots, n - K$, by the inequalities

(2)  (a)  $x_1 \ge v_k$
          $x_2 \ge v_k + v_{k+1}$
          $\vdots$
          $x_{K-1} \ge v_k + v_{k+1} + \cdots + v_{k+K-2}$
          $b_{k+K-1} \ge v_k + v_{k+1} + \cdots + v_{k+K-1}$
          $b_{k+K} \ge v_{k+1} + v_{k+2} + \cdots + v_{k+K}$
          $\vdots$
          $b_n \ge v_{n-K+1} + \cdots + v_n$
     (b)  $r_i \ge v_i \ge 0$,   $i = k + 1, \ldots, n$.

We assume that $n \ge K$.

Let us first compute $f_{n-K+1}(x_1, x_2, \ldots, x_{K-1})$. This is the maximum of

(3)  $v_{n-K+1} + \cdots + v_n$

subject to the constraints

(4)  (a)  $x_1 \ge v_{n-K+1}$
          $x_2 \ge v_{n-K+1} + v_{n-K+2}$
          $\vdots$
          $x_{K-1} \ge v_{n-K+1} + \cdots + v_{n-1}$
          $b_n \ge v_{n-K+1} + \cdots + v_n$
     (b)  $r_i \ge v_i \ge 0$,   $i = n - K + 2, \ldots, n$.

Thus

(5)  $f_{n-K+1}(x_1, x_2, \ldots, x_{K-1})$
     $= \min\,(b_n,\ x_{K-1} + r_n,\ x_{K-2} + r_n + r_{n-1},\ \ldots,\ x_1 + r_{n-K+2} + \cdots + r_n)$.

The recurrence relation for the sequence is

(6)  $f_{k-1}(x_1, x_2, \ldots, x_{K-1})$
     $= \max_{0 \le v_{k-1} \le v_{k-1}^*} \{v_{k-1} + f_k[\min(x_2 - v_{k-1},\ r_k),\ x_3 - v_{k-1},\ \ldots,\ x_{K-1} - v_{k-1},\ b_{k+K-2} - v_{k-1}]\}$,

where

(7)  $v_{k-1}^* = \min\,[x_1, x_2, \ldots, x_{K-1}, b_{k+K-2}]$.

Let us now assume that $f_k$ has the form

(8)  $f_k(x_1, x_2, \ldots, x_{K-1}) = \min\,[P_{0,k},\ x_1 + P_{1,k},\ x_2 + P_{2,k},\ \ldots,\ x_{K-1} + P_{K-1,k}]$.

Then, substituting in (6),

(9)  $f_{k-1}(x_1, x_2, \ldots, x_{K-1})$
     $= \max_{0 \le v_{k-1} \le v_{k-1}^*} \{v_{k-1} + \min\,[P_{0,k},\ \min(x_2 - v_{k-1},\ r_k) + P_{1,k},\ x_3 - v_{k-1} + P_{2,k},\ \ldots,\ b_{k+K-2} - v_{k-1} + P_{K-1,k}]\}$
     $= \max_{0 \le v_{k-1} \le v_{k-1}^*} \min\,[P_{0,k} + v_{k-1},\ x_2 + P_{1,k},\ v_{k-1} + r_k + P_{1,k},\ x_3 + P_{2,k},\ \ldots,\ b_{k+K-2} + P_{K-1,k}]$.

The maximum is clearly assumed at $v_{k-1} = v_{k-1}^*$.

Hence we have

(10) $f_{k-1}(x_1, x_2, \ldots, x_{K-1})$
     $= \min\,[P_{0,k} + v_{k-1}^*,\ x_2 + P_{1,k},\ v_{k-1}^* + r_k + P_{1,k},\ x_3 + P_{2,k},\ \ldots,\ b_{k+K-2} + P_{K-1,k}]$
     $= \min\,\{\min[P_{0,k},\ P_{1,k} + r_k] + \min[x_1, x_2, \ldots, x_{K-1}, b_{k+K-2}],\ x_2 + P_{1,k},\ x_3 + P_{2,k},\ \ldots,\ b_{k+K-2} + P_{K-1,k}\}$
     $= \min\,\{x_1 + w_k,\ x_2 + w_k,\ \ldots,\ x_{K-1} + w_k,\ b_{k+K-2} + w_k,\ x_2 + P_{1,k},\ x_3 + P_{2,k},\ \ldots,\ b_{k+K-2} + P_{K-1,k}\}$,

where

(11) $w_k = \min\,[P_{0,k},\ P_{1,k} + r_k]$.

Hence

(12) $f_{k-1}(x_1, x_2, \ldots, x_{K-1}) = \min\,\{x_1 + w_k,\ x_2 + \min[w_k, P_{1,k}],\ x_3 + \min[w_k, P_{2,k}],\ \ldots,\ \min[b_{k+K-2} + w_k,\ b_{k+K-2} + P_{K-1,k}]\}$.

From this equation we can read off the recurrence relations connecting the
$P_{i,k}$ and the $P_{i,k-1}$.

12. Discussion 

The solution presented in the preceding section yields the optimal policy at
each stage, as well as the value of the minimum cost.
There are a number of related problems which can be treated by similar methods.
A particularly interesting one is the case where the demand is periodic. In
this case, the problem reduces to maximizing

(1)  $L_n = \sum_{i=1}^{n} v_i$,

subject to a series of constraints

(2)  (a)  $v_1 + v_2 + \cdots + v_K \le b_1$
          $v_2 + v_3 + \cdots + v_{K+1} \le b_2$
          $\vdots$
          $v_{n-K+1} + \cdots + v_n \le b_{n-K+1}$
          $v_{n-K+2} + \cdots + v_1 \le b_{n-K+2}$
          $\vdots$
          $v_n + v_1 + \cdots + v_{K-1} \le b_n$
     (b)  $0 \le v_i \le r_i$.

Furthermore, there are the interesting problems in which there is a storage
cost for each excess item, and in which there are more than two types of laundry
service.

It is also easy to see that several more general classes of maximization problems
subject to linear constraints may be treated by means of the same technique.
We shall discuss these topics, together with the question of actual
computational solution, in a further paper.
ON THE CATERER PROBLEM*


WILLIAM PRAGER 
Brown University 

1. Introduction 

The Caterer Problem was formulated by Jacobs [1]¹ as a paraphrase of a practical
problem concerning the number of spare engines required to assure given
operational levels of a fleet of airplanes. Jacobs stated the problem as follows.

A caterer knows that in connection with the meals he has arranged to serve
during the next n days, he will need $r_j$ (> 0) fresh napkins on the jth
day (j = 1, 2, ..., n). Laundering normally takes p days; that is, a soiled napkin
sent for laundering immediately after use on the jth day is returned in time
to be used again on the (j + p)th day. However, the laundry also has a higher-cost
service which returns the napkins in q < p days (p and q being integers).
Having no usable napkins on hand or in the laundry, the caterer will meet his
early requirements by purchasing napkins at a cents each. Laundering costs b
and c cents a napkin for the normal and high-cost services, respectively, where
b < c < a. How does the caterer arrange matters to meet his needs and minimize
his outlay for the n days?

Formulating this as a problem in linear programming and considering the
case $q = p - 1$, Jacobs showed how the problem could be simplified by several
transformations to such a degree that an explicit solution could be given. To
this writer, Jacobs’ analysis appears as a mathematical tour de force, which fails,
however, to shed light on the features of the problem that make the explicit
solution possible. A similar feeling was recently expressed by Hoffman [2].

In the present paper, the Caterer Problem is shown to be equivalent to a Hitchcock
Distribution Problem [3] with a very special cost matrix. For the case
$q = p - 1$, a simple procedure taking advantage of this fact is developed and
shown to yield Jacobs’ solution. The possible extension of the procedure to the
case $p - q > 1$ is illustrated by a numerical example.

2. The Caterer Problem as a Special Distribution Problem 

It is readily seen that the Caterer Problem can be formulated as a Distribution
Problem in which the store and each day’s hamper of soiled napkins are the
origins and each day’s requirement of fresh napkins and the final inventory of
soiled napkins are the destinations. The cost of “shipping” a napkin from the
store to any one of the n days is a. The cost of shipping a napkin from the jth
day’s hamper of soiled napkins to the kth day is b (when $k - j \ge p$) or c (when
$q \le k - j < p$); when $k - j < q$, this cost must be considered as infinite to
exclude such impossible shipments. Similarly, the cost of shipping a napkin from
the store to the final inventory must be considered as infinite to exclude such
wasteful shipments. Finally, the cost of shipping a napkin from any day’s hamper
of soiled napkins to the final inventory of napkins is zero.

* The results presented in this paper were obtained in the course of research sponsored
by the International Business Machines Corporation of New York City.

1 Numbers in square brackets refer to the Bibliography at the end of the paper.

As usual, we present these costs in the form of a cost matrix with the rows
corresponding to the origins and the columns to the destinations. Since the napkins
for the first q days must be bought at the store and the soiled napkins of
the last q days necessarily go to the final inventory, the first q columns and the
last q rows may be omitted from the cost matrix. Table I shows the structure of
the truncated cost matrix. The important feature of this matrix is its nearly
triangular character: except for the last line, all entries below the main diagonal
are infinite.

TABLE I
Truncated Cost Matrix

                                      Destination (day)
Origin     q+1    q+2    q+3    ...     p     p+1    p+2    p+3    ...     n     Inv.
1           c      c      c     ...     c      b      b      b     ...     b      0
2           ∞      c      c     ...     c      c      b      b     ...     b      0
3           ∞      ∞      c     ...     c      c      c      b     ...     b      0
...
n-q         ∞      ∞      ∞     ...     ∞      ∞      ∞      ∞     ...     c      0
Store       a      a      a     ...     a      a      a      a     ...     a      ∞
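The cost rules of this section can be written down directly. The Python sketch below is an illustration rather than anything given in the paper; the function names and the string labels "store" and "inv" are assumptions made here. It returns the cost of one cell and assembles the truncated matrix of Table I for given n, p, q, a, b, c.

```python
import math

def shipping_cost(origin, dest, p, q, a, b, c):
    """Cost of one cell. origin: day j or "store"; dest: day k or "inv"."""
    if origin == "store":
        return math.inf if dest == "inv" else a   # buying for inventory is excluded
    if dest == "inv":
        return 0                                   # soiled napkin left unused
    gap = dest - origin                            # k - j
    if gap >= p:
        return b                                   # normal laundry suffices
    if gap >= q:
        return c                                   # only the fast service is in time
    return math.inf                                # cannot be laundered in time

def truncated_cost_matrix(n, p, q, a, b, c):
    """Rows: days 1..n-q and the store; columns: days q+1..n and the inventory."""
    rows = list(range(1, n - q + 1)) + ["store"]
    cols = list(range(q + 1, n + 1)) + ["inv"]
    matrix = [[shipping_cost(i, k, p, q, a, b, c) for k in cols] for i in rows]
    return rows, cols, matrix

# Hypothetical prices a = 10, b = 1, c = 3 with n = 10, p = 3, q = 2.
rows, cols, matrix = truncated_cost_matrix(10, 3, 2, 10, 1, 3)
for label, row in zip(rows, matrix):
    print(label, row)
```

With $q = p - 1$ the high-cost entries c occupy exactly the main diagonal of the truncated matrix, which is the feature exploited in the following sections.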

Aside from the special structure of its cost matrix, our distribution problem
has only one unusual feature: the total number of napkins that will be bought
from the store is not known beforehand. A lower bound for this number is obtained
as follows. Let $R_j$ denote the total number of napkins used during the
first j days,

$R_j = \sum_{i=1}^{j} r_i$.   (1)

The greatest number of napkins that could be made available for use on the jth
day through laundering of napkins used on earlier days is $R_{j-q}$, where the
definition (1) must be supplemented by the convention that $R_j = 0$ for $j \le 0$. Thus
if the difference $R_j - R_{j-q}$ is positive, it represents a deficiency of napkins for
use on the jth day, which must anyhow be made up by the purchase of new
napkins. The largest positive difference $R_j - R_{j-q}$ for j = 1, ..., n therefore
is a lower bound for the number of napkins that will be bought. It may, of course,
be advantageous to purchase in excess of this lower bound if the expenses for
express laundry service can thereby be reduced sufficiently.
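As a small illustration of this bound, the following Python sketch (the function name is an assumption, not taken from the paper) forms the cumulative requirements $R_j$, the differences $R_j - R_{j-q}$ with $R_j = 0$ for $j \le 0$, and their maximum. The data are the requirements of the paper's numerical example with q = 2.

```python
from itertools import accumulate

def purchase_lower_bound(r, q):
    """Return (S, R, diffs): S is the largest difference R_j - R_{j-q}."""
    R = [0] + list(accumulate(r))                 # R[j] = r_1 + ... + r_j, R[0] = 0
    diffs = [R[j] - R[max(j - q, 0)] for j in range(1, len(r) + 1)]
    return max(diffs), R[1:], diffs

r = [50, 60, 80, 70, 50, 60, 90, 80, 50, 100]     # requirements of Table II
S, R, diffs = purchase_lower_bound(r, q=2)
print(S)        # lower bound on the number of napkins bought
print(R)        # accumulated requirements
print(diffs)    # differences R_j - R_{j-2}
```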

3. The case q = p - 1: Numerical example

To solve the Caterer Problem when $q = p - 1$, we assume at first that the
number of napkins bought is given by the lower bound derived in the preceding
section. Having worked out an optimal program under this assumption, we
finally check whether this can be improved by the purchase of additional napkins.
To familiarize the reader with the various steps of the suggested procedure,
these will first be illustrated by a numerical example. It will be shown in Section
4 that this procedure always leads to an optimal program.

For our numerical example, p = 3, q = 2, and the requirements $r_j$ are given
in Table II, which also lists the accumulated requirements $R_j$ and the differences
$R_j - R_{j-2}$. It is seen from the last row in Table II that S = 170 is a lower
bound for the number of napkins that have to be bought. Accepting, for the
present, this lower bound as the actual number of napkins bought, we work out
the feasible program shown in Table III. Since this table is truncated in the
same manner as Table I, the purchases appearing in the table are
$S' = S - r_1 - r_2 = 60$ and the final inventory appearing in the table is
$I' = S - r_9 - r_{10} = 20$.

TABLE II
Numerical Example

j                  1     2     3     4     5     6     7     8     9    10
r_j               50    60    80    70    50    60    90    80    50   100
R_j               50   110   190   260   310   370   460   540   590   690
R_j - R_{j-2}     50   110   140   150   120   110   150   170   130   150

TABLE III
Feasible Program (p = 3, q = 2)

j          3      4      5      6      7      8      9     10     I'    r_j
1         20     30                                                      50
2                40     20*            0*                                60
3                       30*    50*                                       80
4                              10*    60*                                70
5                                     30**   20**                        50
6                                            60**     **                 60
7                                                   50**   40      0**   90
8                                                          60     20     80
S'        60                           0**                               60
r_j       80     70     50     60     90     80     50    100     20    600

Beginning at the right end of the eighth row and proceeding towards the left,
we distribute $r_8 = 80$ into the eighth row cells so as to exhaust either the
capacities of these cells noted in the bottom row or the amount $r_8$ available for
distribution in the eighth row. We then proceed to distribute $r_7 = 90$ in a similar manner.
The feasible program shown in Table III is obtained by continuing in this manner
and programming purchases only if the column totals cannot be met otherwise.
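The right-to-left filling rule just described can be sketched in a few lines of Python. This is an illustrative reading of the procedure, not code from the paper; the function name, the dictionary representation of the program, and the labels "S" (purchase row) and "Inv" (inventory column) are assumptions. Rows are processed from n - q up to 1, each row filling its usable cells from the inventory column leftwards, and purchases cover whatever column totals remain unmet.

```python
def feasible_program(r, q, S):
    """Truncated program as {(row, column): amount}, given S napkins bought in all."""
    n = len(r)
    need = {k: r[k - 1] for k in range(q + 1, n + 1)}    # column totals r_k
    need["Inv"] = S - sum(r[n - q:])                      # I' = S - R_n + R_{n-q}
    program = {}
    for j in range(n - q, 0, -1):                         # rows n-q, ..., 1
        left = r[j - 1]                                   # soiled napkins of day j
        for k in ["Inv"] + list(range(n, j + q - 1, -1)): # usable cells, right to left
            amount = min(left, need[k])
            if amount > 0:
                program[(j, k)] = amount
                need[k] -= amount
                left -= amount
        if left:
            raise ValueError("row %d could not be distributed" % j)
    for k in range(q + 1, n + 1):                         # purchases meet the rest
        if need[k] > 0:
            program[("S", k)] = need[k]
    return program

r = [50, 60, 80, 70, 50, 60, 90, 80, 50, 100]
program = feasible_program(r, q=2, S=170)
for cell in sorted(program, key=str):
    print(cell, program[cell])
```

For the data of Table II the nonzero cells produced this way can be compared entry by entry with Table III.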

This program cannot be improved as long as purchasing remains restricted
to S' = 60. Indeed, the only way in which the program could be modified while
preserving the row and column totals is as follows. Select an even number of
cells with finite cost in such a manner that no row or column contains an odd
number of these cells. In Table III such a choice has been indicated by putting
an asterisk in each selected cell. Join these cells by a closed path of alternating
vertical and horizontal steps. Proceeding along this path, alternatingly increase
and decrease the entries in the cells by the same amount. This obviously preserves
the row and column totals, but yields a feasible program only if all previously
empty cells of the path receive positive entries. For the path indicated
in Table III, this condition implies that the amounts in the cells selected on the
main diagonal have to be increased. The considered modification of the program
therefore leads to an increased use of express laundry service, i.e. to increased
expenditure. It is readily seen that the arrangement of the entries in the cells
of Table III is such that it is impossible to choose a path of cells yielding a
decrease in cost.

So far, S' and hence also I' were considered as fixed, at S' = 60 and I' = 20. Let us
now check whether the cost of the program can be reduced by the purchase of
additional napkins. An increase in purchase and final inventory involves changes
in cells of an open path that begins in some cell of the row S' and proceeds by
alternating vertical and horizontal steps to end in some cell of the column I'.
In Table III such a path is indicated by double asterisks. This particular path
touches three cells on the main diagonal, two cells off this diagonal, and one cell
each in the purchase row and the inventory column. If the entries in the cells
of this path are alternatingly increased and decreased by the same amount $\delta$,
the entries in the cells on the main diagonal are all decreased and those in the
cells off this diagonal are all increased. The change in cost is therefore given by

$\delta(a - 3c + 2b) = \delta(a - b)\left(1 - 3\,\frac{c - b}{a - b}\right)$.

If the considered modification of the program is to result in a saving, we must
have

$\frac{a - b}{c - b} < 3$,

where the right-hand side is the number of cells in the path that lie on the main
diagonal.
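The same bookkeeping can be written for a path with an arbitrary number of diagonal cells. The LaTeX lines below are a sketch of that generalization under the assumption $q = p - 1$; the symbol $m$ for the number of main-diagonal cells and $\Delta C$ for the change in cost are introduced here for illustration. An open path with $m$ diagonal cells contains one purchase cell (cost $a$, increased), $m$ diagonal cells (cost $c$, decreased), $m - 1$ off-diagonal cells (cost $b$, increased) and one inventory cell (cost 0).

```latex
\Delta C = \delta\,[\,a - m c + (m - 1) b\,]
         = \delta\,[\,(a - b) - m (c - b)\,],
\qquad
\Delta C < 0 \iff \frac{a - b}{c - b} < m .
```

For $m = 3$ this reduces to the expression $\delta(a - 3c + 2b)$ obtained above.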

Obviously, the path indicated by the double asterisks in Table III is far from
being the best of this kind. The best one begins in the first cell of the purchase
row, ascends to the first cell of the first row and then descends in a stairlike
fashion along the main diagonal to end in the eighth row of the inventory column.
Since this best path involves 8 cells on the main diagonal, the program of
Table III cannot be improved at all if $(a - b)/(c - b) > 8$.

To be specific, let us assume that $4 < (a - b)/(c - b) < 5$. To lead to an
improvement, a path must therefore have at least five cells on the principal
diagonal. For the eight-cell path considered above, the amount $\delta$ is limited by
the smallest entry (i.e. 10) on the main diagonal, because the modification must
not lead to negative entries. The modified program is shown in Table IV.
TABLE IV
Modified Program (p = 3, q = 2)

j          3      4      5      6      7      8      9     10    Inv.   r_j
1         10*    40*                                                     50
2                30*    30*                                              60
3                       20*    60      0*                                80
4                                     70                                 70
5                                     20*    30*                         50
6                                            50*    10*                  60
7                                                   40*    50*           90
8                                                          50*    30*    80
Purch.    70*                                                            70
r_j       80     70     50     60     90     80     50    100     30    610

TABLE V
Optimal Program (p = 3, q = 2)

j          3      4      5      6      7      8      9     10    Inv.   r_j
1                50                                                      50
2                10     50                                               60
3                              60     20                                 80
4                                     70                                 70
5                                            50                          50
6                                            30     30                   60
7                                                   20     70            90
8                                                          30     50     80
Purch.    80     10                                                      90
r_j       80     70     50     60     90     80     50    100     50    630


The asterisks in Table IV indicate another path of this kind that includes 7
cells on the main diagonal and hence leads to a further improvement of the program.
The amount $\delta$ for this path is given by the entry in the first cell of the first
row in Table IV. After this second modification there remain six cells with positive
entries on the main diagonal, so that the technique can be applied once
more. Table V shows the program after this third modification. Since there are
now only four cells with positive entries on the main diagonal no further improvement
is possible, as $(a - b)/(c - b)$ has been assumed to exceed 4.

4. The case q = p — 1: General considerations 

We now investigate whether the procedure outlined in the preceding section 
could fail. In the course of this investigation an analytical description of the 
procedure will be obtained that shows the final program to be identical with 
Jacobs’. 

The lower bound for the number of napkins that have to be bought is

$S = \max_j\,(R_j - R_{j-q})$,   (j = 1, 2, ..., n)   (2)

where

$R_k = 0$ for $k \le 0$,   $R_k = \sum_{i=1}^{k} r_i$ for $k > 0$.   (3)

For the truncated program (e.g. Table III), we need

$S' = S - R_q$   (4)

and

$I' = S - R_n + R_{n-q}$.   (5)


In working out the first program, starting from the lower right, could we
encounter the following difficulty? As we distribute $r_k$ into the cells of the kth row
(k = 1, 2, ..., n - q), proceeding from right to left and putting into each cell
as much as is possible, can we reach the left-most usable cell of this row and
still have more left to put into this cell than it will take? It is easily shown that
this cannot occur. Indeed, the amount available for the left-most usable cell
of the kth row is

$d_k = \left(\sum_{i=k}^{n-q} r_i - \sum_{i=k+q+1}^{n} r_i - I'\right)^+ = (R_{n-q} - R_{k-1} - R_n + R_{k+q} - I')^+$,   (6)

where $(\cdots)^+$ denotes $\tfrac{1}{2}\{(\cdots) + |(\cdots)|\}$. Substituting I' from (5), we
obtain

$d_k = (R_{k+q} - R_{k-1} - S)^+ = (r_{k+q} + R_{k+q-1} - R_{k-1} - S)^+ \le r_{k+q}$,   (7)

because $S \ge R_{k+q-1} - R_{k-1}$ by (2). Since $r_{k+q}$ is the total for the column that
contains the considered cell, and since $d_k$ is the first entry to be made anywhere
in this column, the anticipated difficulty cannot arise. Our procedure thus leads
to a feasible program.
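Formula (7) is easy to evaluate. The following Python sketch (function name assumed for illustration) computes S from (2) and the diagonal entries $d_k = (R_{k+q} - R_{k-1} - S)^+$ for k = 1, ..., n - q, and checks the bound $d_k \le r_{k+q}$ on the data of Table II.

```python
from itertools import accumulate

def diagonal_entries(r, q):
    """Return (S, d) with d[k-1] = (R_{k+q} - R_{k-1} - S)^+ for k = 1..n-q."""
    n = len(r)
    R = [0] + list(accumulate(r))                          # R[0] = 0
    S = max(R[j] - R[max(j - q, 0)] for j in range(1, n + 1))
    d = [max(R[k + q] - R[k - 1] - S, 0) for k in range(1, n - q + 1)]
    return S, d

r = [50, 60, 80, 70, 50, 60, 90, 80, 50, 100]
S, d = diagonal_entries(r, q=2)
print(S, d)
# d_k never exceeds the column total r_{k+q}, as shown in (7).
print(all(d[k - 1] <= r[k + 2 - 1] for k in range(1, len(d) + 1)))
```

The list `d` can be compared with the main-diagonal entries of Table III.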

Next, we show that this program cannot be improved short of purchasing additional
napkins. To this end, we observe that our procedure implies certain spatial
relations between the left-most entry in one row and the right-most entry in
the next higher row.

Let the left-most entry in row k occur in column l. The fact that something
was put into column l in row k indicates that the requirements for column l + 1
have been met. The first entry to be put into row k - 1 can therefore not be to
the right of column l; it will be in column l if the requirements of this column
have not yet been met; otherwise it will be in column l - 1 since no entry has
as yet been made in this column. Thus, the right-most entry in a row is either
in the same column as the left-most entry in the next lower row or immediately
to the left of this column.

From this pattern there follows an important property of any path that starts
from a cell in the main diagonal and proceeds by alternating vertical and horizontal
steps to return to the starting point without touching any cell twice or
descending below the main diagonal: the top right corner of such a path falls
necessarily into an empty cell. To be acceptable, any program change effected
by alternatingly decreasing and increasing the entries in the cells of this closed
path by a fixed quantity $\delta$ must increase the amount in this previously empty
cell and hence also the amount in the starting cell on the main diagonal. Such a
program change cannot, therefore, constitute an improvement because it substitutes
high-cost laundry service for normal service.

We must now check whether our tentative program can be improved by the
purchase of additional napkins. It follows from the discussion of this point in the
preceding section that this is possible whenever the number of non-vanishing
entries $d_k$ on the main diagonal exceeds $(a - b)/(c - b)$. Let us order these
entries $d_k$ by descending magnitude, repeating, if necessary, each value according
to its multiplicity. If m is the least integer greater than $(a - b)/(c - b)$ and $d_{k^*}$
the mth of the ordered values $d_k$, it is clear that repeated application of the
process discussed in reducing the program of Table III to that of Table IV will
eventually reduce to zero the entries in all main diagonal cells that contained
values $d_k$ of a higher order than $d_{k^*}$. At this stage, the number of non-vanishing
entries on the main diagonal has fallen below m and further improvement of the
program has become impossible. The final entries on the main diagonal are
therefore given by

$x_k = (d_k - d_{k^*})^+$.   (8)

This agrees with the first Eq. (2.7) in Jacobs’ paper, since our $d_k$ corresponds to
Jacobs’ $H_{k+q}$. Once the number of napkins that is sent to the high-cost laundry
service is known for each day, the number that must be bought on each day can
be worked out by a simple book-keeping method similar to the one that furnished
Eq. (2). This yields the second Eq. (2.7) of Jacobs’ paper, and the third Eq. (2.7)
follows from this. Actually, the stepwise improvement procedure suggested here
furnishes the high-cost service and purchase requirements about as fast as they
could be computed from Jacobs’ formulas.
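Eq. (8) can be illustrated on the numerical example of Section 3. The Python sketch below is an assumption-laden illustration: the function name and the prices a = 10, b = 1, c = 3 (chosen so that the ratio lies between 4 and 5, as assumed in Section 3) are not from the paper, and the early return for the case of fewer than m entries is added here for completeness. The input list holds the main-diagonal entries of Table III.

```python
import math

def final_diagonal(d, a, b, c):
    """Final high-cost laundry amounts (d_k - d_{k*})^+ as in Eq. (8)."""
    ratio = (a - b) / (c - b)
    m = math.floor(ratio) + 1                  # least integer greater than ratio
    ordered = sorted(d, reverse=True)          # descending, with multiplicity
    if m > len(ordered):
        return list(d)                         # too few entries: nothing to improve
    d_star = ordered[m - 1]                    # the m-th ordered value
    return [max(value - d_star, 0) for value in d]

d = [20, 40, 30, 10, 30, 60, 50, 60]           # main diagonal of Table III
print(final_diagonal(d, a=10, b=1, c=3))
```

The resulting list can be compared with the main-diagonal entries of Table V, where empty cells count as zero.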

5. The case q = p — 2: Numerical example 

The present synthetic approach has a practical advantage over Jacobs’ analytical
treatment of the problem: it is readily generalized to the case where
$p - q$, though small, exceeds unity. To show this, let us modify the example of
Table II by assuming that q = 2, as before, but p = 4. The program in Table III
is still feasible, but in trying to improve it, we must keep in mind that now not
only the cells on the main diagonal but also the cells immediately to the right
of this diagonal are high-cost cells. As a consequence of this, it is no longer true
that we cannot improve this feasible program short of buying additional napkins.
Table VI shows the program obtained after improving the first program as
much as is possible without increased purchase of napkins. Each improvement
step adds an amount $\delta$ to a cell on the main diagonal and to the cell immediately
to the upper right of it, while removing the amount $\delta$ from each of the two
remaining cells of a square of four cells.
TABLE VI
Improved Program (p = 4, q = 2)

j          3      4      5      6      7      8      9     10    Inv.   r_j
1         20*    10     20*                                              50
2                60                                                      60
3                       30*           50*                                80
4                              60            10                          70
5                                     40*    10      0*                  50
6                                            60                          60
7                                                   50*    20     20*    90
8                                                          80            80
Purch.    60*                                                            60
r_j       80     70     50     60     90     80     50    100     20    600

TABLE VII
Optimal Program (p = 4, q = 2)

j          3      4      5      6      7      8      9     10    Inv.   r_j
1                10     40                                               50
2                              60                                        60
3                       10            70                                 80
4                                            70                          70
5                                     20     10     20                   50
6                                                          60            60
7                                                   30     20     40     90
8                                                          20     60     80
Purch.    80     60                                                     140
r_j       80     70     50     60     90     80     50    100    100    680


We next check whether further improvements are possible by additional purchases
of napkins. The technique is essentially the same as before, except that
the stair-like path down the main diagonal now must have horizontal or vertical
steps of the minimum length 2, if it is to avoid the high-cost cells off the main
diagonal. Such a path is indicated by asterisks in Table VI. Since it has 4 cells
on the main diagonal, it will lead to a program of lower cost only if
$(a - b)/(c - b) < 4$. Let us assume that this is the case, but that
$(a - b)/(c - b) > 3$. The optimal program shown in Table VII is then obtained
by applying this improvement procedure twice, once for the path marked in
Table VI and then for a similar path whose steps are one cell to the right and
below those of the first path. While an analytical description of the optimal
program for $p - q > 1$ could be worked out, it appears highly doubtful that it
would yield this program faster than the synthetic procedure followed here.*


* Since this paper was written, Dr. George B. Dantzig has informed the author that
the equivalence of the Caterer Problem to a Transportation Problem has been recognized
for some time by the members of his research group at the Rand Corporation. More recently,
it seems even to have been stated in print in a report by S. Hoch (USAF-
[remainder of the report designation illegible in the source]). As the author has not been
able to obtain a copy of this report, he cannot comment
on the relation between the present approach and that of Mr. Hoch.
LETTERS TO THE EDITOR


Dear Sir:

There are two points arising out of Professor W. Prager’s paper “On the
Caterer Problem”, Management Science 3 (1956), pp. 15-23, which seem of
interest.

One is that the whole Caterer Problem can be regarded as a Transportation
Problem, even if the number of napkins to be bought is not regarded as known,
if one introduces a large number of napkins originally (i.e. at least as many as
the largest sum of p successive daily requirements) but regards the cost of
transporting a new napkin to the final inventory as zero, this meaning that the
napkin is not really bought at all.

The other point is that if the total number of napkins is given, and there are
only two types of laundry involved, the solution can be written down straight
away as follows:

Consider the requirement at each “destination” in turn, and satisfy it according
to the following rules.

1. Always use new napkins until the stock of new napkins is exhausted.

2. If there are not enough new napkins available, arrange as far as possible
to supply napkins last used p or more days earlier through the slow laundry.

3. Use the fast laundry to satisfy any additional demand that cannot be met
by these two means.

This is Prager’s solution when $p - q = 1$. But there is an
important proviso when $p - q > 1$. When using the fast laundry, napkins that
were last used most recently should be selected. Thus one should prefer to have
napkins last used q days ago, rather than napkins last used q + 1 days ago, and
these in turn should be preferred to napkins last used q + 2 days ago (as long
as q + 2 < p), and so on. The point of this proviso is that if napkin A was
last used q days ago, it could not be recovered from the slow laundry for another
p - q days; while napkin B which was last used q + 1 days ago could be made
available from the slow laundry after another p - q - 1 days. If there happens
to be a heavy demand for napkins p - q - 1 days later, then if napkin A is
used now, napkin B can be recovered in time from the slow laundry. But if
napkin B is used now, napkin A cannot be recovered in time from the slow
laundry.

It is a straightforward matter to prove that the above scheme necessarily 
produces an optimal solution if one exists, though there may well be other equally 
cheap solutions. 
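The three rules, together with the proviso, translate into a short day-by-day simulation. The Python sketch below is one possible reading of the scheme, not code supplied by the letter; the function name, the argument `total` (the given total number of new napkins) and the returned pair of counts are assumptions made for illustration. Among napkins old enough for the slow laundry the order of selection is taken here as oldest first, which the letter does not specify.

```python
def beale_schedule(r, p, q, total):
    """Apply rules 1-3 each day; return (slow, fast) counts of laundered napkins."""
    n = len(r)
    new = total                         # stock of unused new napkins
    soiled = [0] * (n + 1)              # soiled[j]: napkins used on day j, not yet reused
    slow = fast = 0
    for k in range(1, n + 1):
        need = r[k - 1]
        used = min(need, new)           # rule 1: new napkins first
        new -= used
        need -= used
        for j in range(1, k - p + 1):   # rule 2: last used p or more days earlier
            take = min(need, soiled[j])
            soiled[j] -= take
            slow += take
            need -= take
        for j in range(k - q, max(k - p, 0), -1):   # rule 3: most recently used first
            take = min(need, soiled[j])
            soiled[j] -= take
            fast += take
            need -= take
        if need > 0:
            raise ValueError("total number of napkins is too small on day %d" % k)
        soiled[k] = r[k - 1]
    return slow, fast

r = [50, 60, 80, 70, 50, 60, 90, 80, 50, 100]   # requirements of Prager's Table II
print(beale_schedule(r, p=3, q=2, total=200))
print(beale_schedule(r, p=4, q=2, total=250))
```

The totals 200 and 250 are the purchases implied by Prager's Tables V and VII (the truncated purchases plus $r_1 + r_2$); the fast-laundry counts returned can be compared with the sums of the high-cost cells in those tables.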

Yours faithfully, 

E. M. L. Beale 
Admiralty Research Laboratory 
Teddington, Middlesex
England 
INVENTORY DEPLETION MANAGEMENT*¹

CYRUS DERMAN and MORTON KLEIN 

Columbia University 

Consideration is given to problems of choosing the order of issue of items 
from a stockpile of material whose utility characteristics are changing with 
time. Conditions are given under which either LIFO (last in, first out) or 
FIFO (first in, first out) is an optimal issue policy. 

1. Introduction 

During the past decade there has been an intensified interest in inventory
problems. Studies in this area have concentrated on the development and
evaluation of stock ordering policies. Two basic papers on this subject are [1] by
Arrow, Harris, and Marschak and [2] by Dvoretzky, Kiefer, and Wolfowitz.
For an elementary exposition, the paper [5] by Laderman, Littauer, and Weiss
is available.

Recently Greenwood [3] and Heit [4] have drawn attention to the problem 
of development and evaluation of stock issuing policies. In this paper three 
problems in this area will be considered: 

Problem 1. A stockpile consists of n items². Associated with the i-th item is
an age (length of time in the stockpile) $S_i$ (i = 1, ..., n). The field life of an
item is a function, $L(S)$, of the age of the item upon issue to the field. When
an item’s usefulness or life in the field is ended, a new one is issued from the
stockpile. Items are to be issued successively until the stockpile is depleted.

The problem of interest, here, is that of finding the order of item issue which 
maximizes the total field life obtained from the stockpile. 
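To make the statement of Problem 1 concrete, the following Python sketch evaluates the total field life of a given issue order and finds the best order by trying all of them. It is an illustration only: the function names, the sample ages, and the linear life function $L(S) = \max(0, 10 - 0.5\,S)$ are hypothetical choices, and exhaustive enumeration is practical only for small n.

```python
from itertools import permutations

def total_field_life(ages, order, life):
    """Issue items in the given order; each item is issued when the previous one fails."""
    t = 0.0
    for i in order:
        t += life(ages[i] + t)          # the item has aged by t while waiting in stock
    return t

def best_order(ages, life):
    """Exhaustive search over all n! issue orders (small n only)."""
    return max(permutations(range(len(ages))),
               key=lambda order: total_field_life(ages, order, life))

# Hypothetical data: three items of ages 0, 2, 4 and a linearly decreasing field life.
ages = [0, 2, 4]
life = lambda s: max(0.0, 10 - 0.5 * s)
order = best_order(ages, life)
print(order, total_field_life(ages, order, life))
```

For this particular life function the search selects the oldest item first; other choices of $L(S)$ may favor a different order.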

Issue policies which permit the replacement of an item in the field before its
usefulness is ended will not be considered here. This case is discussed in Section 7.

Problem 2. An inventory consists of n items of ages $S_1, \ldots, S_n$. Let $X(S)$
be a random variable, and $U(S)$ its expectation, which denotes the utility to
management of an item of age S when issued. A withdrawal schedule is given
which specifies the times at which items will be required. It is assumed that
the schedule exhausts the stock.

It is required to find that order of stock issue which maximizes the total expected
utility obtainable from the n items while meeting the given demand
schedule.

* Received January 1958. 

1 Research under contract with the U. S. Army Chemical Corps Engineering Command. 
Originally issued as Technical Report No. 2, Oct. 15, 1957, of the Statistical Engineering 
Group, Columbia University. 

2 It may be noted that there are situations in which "lot" instead of "item" is
appropriate in the statement of all of the listed problems.
It should be noted that Problems 1 and 2 differ in two respects:

(i) For purposes of mathematical simplicity, it is assumed in Problem 1 that a
deterministic relationship (L(S)) exists between the age of an item and its field
life, i.e., L(S) is not the expectation of a random variable, and

(ii) the schedule of usage in Problem 2 is independent of the function U(S),
whereas the usage schedule in Problem 1 is a function of L(S).

The next problem is a more dynamic version of Problem 2. 

Problem 3. A stockpile contains n items of different ages. At k different times 
a new item is added to stock. A withdrawal schedule is given. 

It is, again, of interest to determine the order of item issue which maximizes 
the total expected utility obtainable from the use of all of the items. 

If complete knowledge of the functions L(S) and U(S) is available, then
optimal policies for any given situation can be obtained by a consideration of
all n! different orderings and a selection of the best. In the case of Problem 2,
computational procedures which are available for the solution of the "optimal
assignment problem" may be utilized.³ However, there are circumstances of
even greater interest. In most real cases only limited knowledge of the
functions L(S) or U(S) and the ages ($S_i$) is available; consequently, complete
enumeration or the utilization of a computational procedure is not possible.
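
As an illustration of the enumeration approach for Problem 2 when U(S) and the ages are fully known, the sketch below tries every assignment of items to withdrawal times. It assumes that an item of initial age S withdrawn at time t has age S + t at issue; the utility function, ages, schedule, and function names are hypothetical.

```python
from itertools import permutations
import math

def total_expected_utility(ages, times, order, U):
    """Sum of U(age at issue) when item order[j] is issued at times[j]."""
    return sum(U(ages[i] + t) for i, t in zip(order, times))

def best_issue_order(ages, times, U):
    """Try all n! assignments of items to withdrawal times; keep the best."""
    return max(permutations(range(len(ages))),
               key=lambda p: total_expected_utility(ages, times, p, U))

# Hypothetical data, for illustration only.
U = lambda s: math.exp(-0.3 * s)   # assumed expected utility of an item of age s
ages = [1.0, 3.0, 6.0]             # ages at time zero
times = [0.0, 2.0, 5.0]            # withdrawal schedule
best = best_issue_order(ages, times, U)
print(best)
```

For larger n, the enumeration would be replaced by an assignment-problem procedure of the kind the text mentions.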

Greenwood [3] and Heit [4] have reported that the most frequently used
depletion policies are LIFO (last in, first out) and FIFO (first in, first out). A
LIFO (FIFO) policy would be that of always using the youngest (oldest) item
on hand first. Greenwood was primarily interested in a comparison between
these two stock issue policies for the case in which the field life function is
linear. Heit also compared these policies and several others, which, however,
require for their implementation complete knowledge of both the deterioration
function and the age of each item.

A LIFO (FIFO) policy is of interest to us since its utilization requires
information concerning only the relative ages of the items. The purpose of this
paper is to give sufficient conditions on L(S) or U(S) under which LIFO will
be optimum over all possible policies. The case where FIFO is optimum will be
briefly discussed.


2. The Main Results 

Theorem 1. If

(i) L(S) is a non-negative, non-increasing convex function, and
(ii) LIFO is an optimal policy for Problem 1 when n = 2, then LIFO is an
optimal policy for Problem 1 for $n = 3, 4, \ldots$.

Theorem 2. If U(S) is a convex function, then LIFO is an optimal policy for 
Problem 2. 

Theorem 3. If U(S) is convex, then LIFO is an optimal policy for Problem 3. 


3 See, for example, Kuhn [6] or Munkres [7].
A function f is convex if for every pair of values $x_1, x_2$ and all values $a_1, a_2$
such that $0 \le a_1, a_2 \le 1$, $a_1 + a_2 = 1$,

$f(a_1 x_1 + a_2 x_2) \le a_1 f(x_1) + a_2 f(x_2)$.

That is, if a line connects any two points of the graph of a convex function, then
all intermediate points of the graph are never above the line. For a concave
function, the direction of the inequality is reversed.

Although complete knowledge of L(S) and U(S) may be lacking, it is
possible to determine, sometimes from a priori considerations, sometimes
empirically, that U(S) is convex. From this point of view, Theorems 2 and 3 represent
satisfactory results. However, Theorem 1 falls short in that it does not provide
such a simple characterization. We shall give below some functions which
satisfy (i) and (ii).


3. Proof of Theorem 1 

Suppose that LIFO is optimal for $n = k \ge 2$. Let $n = k + 1$ and
$0 < S_1 < S_2 < \cdots < S_{k+1}$ be an arbitrary set of initial ages. For any issue
policy, let $S^*$ denote the age of the last item issued. First, observe that none of
the k! policies having $S^* = S_1$ can be optimal, for in any such policy, if $S_i$
denotes the initial age of the item issued next to last, we have $S_i > S_1$, and by
hypothesis this policy could be improved by interchanging the order of issue
of these last two items.

Now for $S^* \ne S_1$, let x denote the total field life obtained from the issue of
the k preceding items and $x^*$ denote the largest possible value of x, so that by
the inductive assumption, $x^* \ge L(S_1)$. Let $Q(x) = x + L(x + S^*)$ denote the
total field life of all k + 1 items. Since L is non-increasing and $S_1 < S^*$, it
follows that $L(S_1) \ge L(S^*)$ and

$Q(x^*) = x^* + L(x^* + S^*) \ge x^* \ge L(S_1) \ge L(S^*) = Q(0)$.

Thus, since Q is convex, Q is maximized for fixed $S^*$ by $x = x^*$, which is
obtained by using a LIFO order on the first k items.

By letting $S^*$ vary over $S_2, S_3, \ldots, S_{k+1}$ we obtain k policies. Suppose the
optimal among these is not the one with $S^* = S_{k+1}$, but some other. Then
since $x^*$ is a result of LIFO order, the item of age $S_{k+1}$ was issued next to last,
and by hypothesis the policy could be improved by interchanging the order of
issue of these last two items. Hence, the optimal policy must be the one that
has $S^* = S_{k+1}$, which is precisely the LIFO policy with $n = k + 1$. The
theorem follows by induction.

As a matter of mathematical interest the question arises as to whether
condition (ii) implies condition (i), i.e., does LIFO as an optimal policy for the case
n = 2 require L to be convex. As a counterexample, consider the case of the
non-convex function defined as follows:

$$L(S) = \begin{cases} 2, & 0 \le S < 1 \\ 1, & 1 \le S. \end{cases}$$

It is easy to verify that (ii) is satisfied. 
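
The verification can be sketched numerically. The check below evaluates condition (ii) for the step function on a grid of age pairs and exhibits one violation of the convexity inequality; the grid and the helper names are assumptions made for illustration, and a grid check is of course not a proof.

```python
def L(s):
    # Step function of the counterexample: 2 on [0, 1), 1 on [1, infinity).
    return 2.0 if s < 1.0 else 1.0

def lifo_total(s1, s2):
    """Two items, s1 < s2: younger item issued first."""
    return L(s1) + L(s2 + L(s1))

def fifo_total(s1, s2):
    """Two items, s1 < s2: older item issued first."""
    return L(s2) + L(s1 + L(s2))

grid = [k / 20 for k in range(61)]            # ages 0.00, 0.05, ..., 3.00
condition_ii = all(lifo_total(a, b) >= fifo_total(a, b)
                   for a in grid for b in grid if a < b)

# Convexity would require L(0.5) <= (L(0) + L(1)) / 2, which fails here.
violates_convexity = L(0.5) > 0.5 * (L(0.0) + L(1.0))
print(condition_ii, violates_convexity)
```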

4. Proof of Theorem 2 

Consider first the case n = 2 and the ages $S_1$ and $S_2$ with $S_1 < S_2$. Let a
denote the scheduling interval. It must be shown that

(1) $U(S_1) + U(S_2 + a) \ge U(S_2) + U(S_1 + a)$.

U(S) is a convex function, hence,

$$\frac{U(S_2) - U(S_1)}{S_2 - S_1} \le \frac{U(S_2 + a) - U(S_1 + a)}{(S_2 + a) - (S_1 + a)}.$$

Equivalently

$U(S_2) - U(S_1) \le U(S_2 + a) - U(S_1 + a)$

which yields (1).

Now let n be any integer greater than 2. Any policy other than LIFO will
be such that there will be two successive items issued in such a way
that the first item issued will be older than the second. However, using (1), it
follows that such a policy could be improved by interchanging the order of
issue of these two items. This process may be continued until no further
interchanges are advantageous and the LIFO order is reached. Thus, since no policy
provides a greater expected utility than LIFO, Theorem 2 is proved.
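
The interchange argument can be traced on a small example. The sketch below repeatedly swaps adjacent items whenever an older item is issued before a younger one and records the total expected utility after each swap. The convex utility function, the ages, and the schedule are hypothetical, and the function names are not from the paper.

```python
import math

def utility(issued_ages, times, U):
    """Total expected utility when the j-th issued item has initial age issued_ages[j]."""
    return sum(U(s + t) for s, t in zip(issued_ages, times))

def improve_by_interchange(issued_ages, times, U):
    """Swap adjacent out-of-order pairs until the youngest-first (LIFO) order is reached."""
    ages = list(issued_ages)
    history = [utility(ages, times, U)]
    swapped = True
    while swapped:
        swapped = False
        for j in range(len(ages) - 1):
            if ages[j] > ages[j + 1]:      # older item issued just before a younger one
                ages[j], ages[j + 1] = ages[j + 1], ages[j]
                history.append(utility(ages, times, U))
                swapped = True
    return ages, history

U = lambda s: math.exp(-0.3 * s)           # hypothetical convex expected utility
final, history = improve_by_interchange([6.0, 1.0, 3.0, 2.0],
                                        [0.0, 1.0, 2.5, 4.0], U)
print(final, [round(h, 4) for h in history])
```

With a convex U, each recorded value should be no smaller than the one before it, which is the content of inequality (1) applied to each swap.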

5. Proof of Theorem 3 

Consider first the case in which there are n original items in the stockpile and
one new item is added at some time. For any stock issue policy, the total
expected utility is composed of two parts: the expected utility from the original
items and the expected utility from the new item. It follows from Theorem 2
that, no matter where the new item is placed in the issue order, the expected
utility from the original items is maximized by the use of a LIFO policy for
the originals. Hence, the optimal policy must be of the form in which the
original items are used in the LIFO order.

Now consider the total expected utility as being composed of the expected
utility evolving from the items used prior to the entrance of the new item and
that evolving from the remainder of the stockpile (including the new
item). Since a LIFO ordering is used on the original items, the expected utility
is maximized by maximizing the second part. However, by Theorem 2 this is
accomplished by using a LIFO ordering for these items.

The case in which k new items are added at different times to the stockpile
is easily proved by induction.





6. Special Field Life Functions 

In this section we show that certain classes of functions satisfy conditions (i)
and (ii) of Theorem 1. At present, we have no satisfactory characterization of
such a class. We shall say that a function belongs to the class $\mathcal{C}$ if it satisfies
(i) and (ii).

Theorem 4. If L(S) is of the form $a/(b + S)$ $(a > 0, b \ge 0)$ then L belongs
to $\mathcal{C}$.

Proof: (i) is clearly satisfied. Let $S_1$, $S_2$ be any non-negative real numbers
with $S_1 < S_2$. To prove (ii) we must show that

(2) $L(S_1) + L(S_2 + L(S_1)) - L(S_2) - L(S_1 + L(S_2)) > 0$.

On substitution we get

$$\frac{a}{b + S_1} + \frac{a}{b + S_2 + \frac{a}{b + S_1}} - \frac{a}{b + S_2} - \frac{a}{b + S_1 + \frac{a}{b + S_2}}
= \frac{a\left(\frac{a}{b + S_1} - \frac{a}{b + S_2}\right)}{b^2 + b(S_1 + S_2) + S_1 S_2 + a} > 0$$

since

$$\frac{a}{b + S_1} > \frac{a}{b + S_2}.$$

Hence (2) is established.

Theorem 5. If L(S) is of the form $c e^{-kS}$ $(c, k > 0)$ then L belongs to $\mathcal{C}$.
Proof: (i) is again clearly satisfied. In order to prove (2) and hence (ii) we
consider the function of S for fixed $S_1$

$F(S_1, S) = L(S_1) + L(S + L(S_1)) - L(S) - L(S_1 + L(S))$.

In the case under study we have

$F(S_1, S) = c e^{-kS_1} + c e^{-k(S + c e^{-kS_1})} - c e^{-kS} - c e^{-k(S_1 + c e^{-kS})}$.

Clearly we have $F(S_1, S_1) = \lim_{S \to \infty} F(S_1, S) = 0$. Also

$$\frac{\partial F}{\partial S} = c\{-k e^{-k(S + c e^{-kS_1})} + k e^{-kS} - c k^2 e^{-kS - k(S_1 + c e^{-kS})}\}$$

$$= ck[e^{-kS}(1 - ck e^{-k(S_1 + c e^{-kS})}) - e^{-k(S + c e^{-kS_1})}]$$

$$= ck e^{-kS}[1 - ck e^{-k(S_1 + c e^{-kS})} - e^{-kc e^{-kS_1}}].$$

The first factor of the above is always positive. The second factor, since its
derivative is negative, is either always negative or at first positive and then
always negative. The first contingency is ruled out because of the values of
$F(S_1, S)$ at $S_1$ and $\infty$. Thus the second is the case and therefore $F(S_1, S) > 0$
for all $S > S_1$.
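
As a numerical companion to Theorems 4 and 5, the sketch below evaluates the left-hand side of (2) on a grid of age pairs for one hyperbolic and one exponential field life function. The parameter values, the grid, and the name `lifo_gain` are hypothetical, and a grid check illustrates the inequality without proving it.

```python
import math

def lifo_gain(L, s1, s2):
    """Left-hand side of (2): LIFO total minus FIFO total for two items, s1 < s2."""
    return L(s1) + L(s2 + L(s1)) - L(s2) - L(s1 + L(s2))

# Hypothetical parameter choices for the two families.
hyperbolic = lambda s, a=3.0, b=0.5: a / (b + s)
exponential = lambda s, c=2.0, k=0.7: c * math.exp(-k * s)

ages = [0.25 * j for j in range(41)]                    # 0, 0.25, ..., 10
pairs = [(s1, s2) for s1 in ages for s2 in ages if s1 < s2]

min_gain_hyperbolic = min(lifo_gain(hyperbolic, s1, s2) for s1, s2 in pairs)
min_gain_exponential = min(lifo_gain(exponential, s1, s2) for s1, s2 in pairs)
print(min_gain_hyperbolic > 0, min_gain_exponential > 0)
```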

7. Partial Use of Items in Problem 1

It was previously remarked that, in the determination of the optimum issue
policy for Problem 1, only those policies in which an item is completely used
were considered. It is conceivable that a policy calling for only partial use of
some of the items might yield a greater total field life than the optimum
obtained over the class of policies which were considered.

Consider the case in which n = 2 and the function L(S) satisfies conditions
(i) and (ii) of Theorem 1. A partial usage policy would be to use one of the
items just for T units of time, where $0 \le T \le L(S_i)$, i = 1 or 2 according to
whether the younger or older of the two items is used first, and then use
(completely) the remaining item.

The problem is to choose i and T in order to maximize the function
$Q(i, T) = T + L(S_{3-i} + T)$, where $S_{3-i}$ is the age of the item not used first.

Since T and L are convex, Q(i, T) achieves its maximum either at T = 0 or at
the largest possible value of T. Using this fact and conditions (i) and (ii) of
Theorem 1, we have

$$\max_{i,T} Q(i, T) = \max[Q(1, 0), Q(2, 0), Q(1, L(S_1)), Q(2, L(S_2))]$$

$$= \max[Q(1, L(S_1)), Q(2, L(S_2))] = Q(1, L(S_1)).$$

Hence, LIFO (using items until failure) is still the optimum issue policy. The
method of induction, as used in the proof of Theorem 1, can be used again to
show that LIFO is optimal for n > 2.
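
The two-item partial-use comparison can be illustrated by a grid search over i and T. The sketch assumes a hyperbolic field life function of the kind covered by Theorem 4; its parameters, the two ages, and the grid resolution are hypothetical.

```python
def L(s, a=3.0, b=0.5):
    # Hypothetical field life function of the form a / (b + S).
    return a / (b + s)

def Q(i, T, s1, s2):
    """Use item i for T time units, then use the other item completely."""
    other_age = s2 if i == 1 else s1
    return T + L(other_age + T)

s1, s2 = 1.0, 2.5          # item 1 is the younger item
steps = 200
candidates = []
for i, s in ((1, s1), (2, s2)):
    t_max = L(s)           # an item cannot be used beyond its own field life
    for j in range(steps + 1):
        T = t_max * j / steps
        candidates.append((Q(i, T, s1, s2), i, T))

best_value, best_i, best_T = max(candidates)
print(best_i, best_T, best_value)
```

On this example the search should select the younger item and use it to failure, in line with the conclusion above.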

8. FIFO as an Optimal Policy 

If L(S) is linear and decreasing it can be seen easily that FIFO is optimal for
Problem 1. This rules out any such result as convexity of L(S) being a sufficient
condition for the optimality of LIFO for Problem 1.
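
For two items the linear case can be worked out directly. The following computation is a sketch that assumes $L(S) = \alpha - \beta S$ with $\beta > 0$ (the symbols $\alpha$ and $\beta$ are introduced here only for illustration) and ages $S_1 < S_2$ small enough that every field life involved is positive.

```latex
\begin{align*}
\text{LIFO: } & L(S_1) + L\bigl(S_2 + L(S_1)\bigr)
   = 2\alpha - \alpha\beta - \beta(S_1 + S_2) + \beta^2 S_1, \\
\text{FIFO: } & L(S_2) + L\bigl(S_1 + L(S_2)\bigr)
   = 2\alpha - \alpha\beta - \beta(S_1 + S_2) + \beta^2 S_2, \\
\text{FIFO} - \text{LIFO} & = \beta^2 (S_2 - S_1) > 0 .
\end{align*}
```

Under these assumptions FIFO yields the larger total for n = 2, although a linear L is convex.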

If U(S) is concave, then a reversal of the argument shows that FIFO is
optimal for Problems 2 and 3.

Note that these remarks are somewhat less than rigorous, since L(S) linear
(or U(S) concave) and decreasing would lead to negative values of U and L
for large enough S. Thus, it is tacitly assumed that the ages are within a range
such that U, L are positive.

9. Remarks 

The characterization of the expected utility functions for which LIFO and
FIFO inventory depletion policies are optimal, even for such simple problems
as were considered, is fairly useful:

(a) These policies are easily understood and for the most part easily
implemented in practice.

(b) For Problems 2 and 3 neither the demand schedule nor the replacement
schedule need be known or regular; the precise function need not be known
either. Only information as to the convexity or concavity (within limits)
of the expected utility function is required for an optimal policy to be
chosen. Since the optimality of LIFO holds for any given schedule, it is
easy to see that it is also optimal for a random schedule, provided that the
scheduling is independent of the issue policy. This information may be
used to approximate a solution to the following more general version of
Problem 1:

Suppose there are, instead of one, many sources of demand. This changes
the situation so that instead of the issue of the i-th item awaiting the
failure of the (i - 1)-th item, the demand schedule takes on the character
of a random demand schedule. Since LIFO is optimal for Problem 2, it is
reasonable to utilize it as an approximately optimal policy for this
problem also.

(c) Reasonable approximations to many other functional forms can be
obtained via convex (concave) functions. Hence, there is an even wider range of
potential approximate application for Theorems 2 and 3.

It should be noted that even the narrow scope of the problems posed here is 
not fully explored. Other kinds of objective functions are clearly of interest as 
are the policy requirements imposed by other kinds of field life and expected 
utility functions. 
LIFO VS FIFO IN INVENTORY DEPLETION
MANAGEMENT*

GERALD J. LIEBERMAN
Columbia and Stanford Universities

1. Introduction and Purpose

In a recent paper (1), Derman and Klein consider the following problem: A
stockpile consists of n items. Associated with the i-th item is an age (length of
time in the stockpile) $S_i$ $(i = 1, 2, \ldots, n)$. The field life of an item is a function,
L(S) ($\ge 0$), of the age of the item upon issue to the field. When an item's
usefulness or life in the field is ended, a replacement is issued from the stockpile. Items
are to be issued successively until the stockpile is depleted. The problem of
interest is to find the order of item issue which maximizes the total field life
obtained from the stockpile.

In (1), it is pointed out that if a complete knowledge of the function L(S) is
available, then optimal policies for any given situation can be obtained either by
a consideration of all n! different orderings and the consequent selection of the
best, or by using an algorithm which will lead to the solution. For large n (and
even moderately sized n) heavy numerical calculations are involved.
Furthermore, in the usual circumstances there is only limited knowledge available
about the function L(S) and the ages ($S_i$), e.g., general shape of L(S) and
possibly the ranking of the $S_i$, making it impossible to use these techniques to find
an optimal policy.

In their paper, Derman and Klein present sufficient conditions on L(S) under
which a LIFO (last in, first out) policy is optimal. This corresponds to issuing
the youngest item on hand first, and its utilization requires information
concerning the relative ages of the items, rather than their absolute ages. In particular
they show that if

(i) L(S) is a non-increasing convex function, and

(ii) LIFO is an optimal policy when n = 2, then LIFO is an optimal policy
for $n = 3, 4, \ldots$.

The purpose of this paper is to present an alternate set of sufficient conditions 
on L(S) under which a LIFO policy is optimal. In addition, this paper will present 
two sets of sufficient conditions on L(S) under which a FIFO (first in, first out) 
policy is optimal. This corresponds to issuing the oldest item on hand first, and 
its utilization, like LIFO, requires information concerning only the relative ages 
of the items. Moreover, the second set of conditions will not involve verifying the 
results for the case of two items (n = 2). 

* Received March 1958. 

† This work was partially supported under contracts DA 18-108-cml-6125 for the U. S.
Army Chemical Corps, Engineering Command, N6onr-25126 and Nonr-266(33) for the
2. A Set of Sufficient Conditions under which LIFO Is Optimal

The following theorem gives a set of sufficient conditions under which a LIFO
policy is optimal.¹

Theorem 1:

(i) If $L'(S) \ge -1$ and

(ii) LIFO is an optimal policy when n = 2, then LIFO is an optimal policy
for $n = 3, 4, \ldots$.

Proof: The proof is similar to that used by Derman and Klein, and will be by
induction. The function to be maximized by choosing an optimum issue policy
may be written in the form $Q(x) = x + L(x + S^*)$ where x denotes the field
life resulting from the use of the first n - 1 items issued and $S^*$ is the initial
age of the nth item issued. $L(x + S^*)$ is then the field life resulting from the
nth item issued. From condition (i) it follows that Q(x) is a non-decreasing
function since $Q'(x) = 1 + L'(x + S^*)$ is always non-negative. Hence Q(x) is
maximized by making x as large as possible. Derman and Klein use their condition
(i) only to show that Q(x) is maximized by making x as large as possible. Hence,
the remainder of the Derman-Klein proof is applicable since their condition (ii)
is the same as that given above. The proof will be repeated here for the sake of
completeness. By the induction assumption x is maximized by using a LIFO
policy in the first n - 1 items issued. Now, if the issuing policy is such that the
oldest item is not issued last, it follows that in maximizing x it will be issued next
to last. If so, it follows from condition (ii) that the issuing policy could be
improved by interchanging the order of issue of the last two items. Thus, the optimal
policy must be of the form where the oldest item is issued last. This being
established, the theorem follows from the application, again, of the induction
assumption.

It should be pointed out that condition (i), $L'(S) \ge -1$, does not imply
convexity of L(S) nor does convexity of L(S) imply that $L'(S) \ge -1$. Neither is a
stronger condition than the other.
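
Two simple functions illustrate the point. They are hypothetical examples chosen for this note, not functions discussed in the paper: the first satisfies the derivative condition without being convex, and the second is convex and non-increasing but violates the derivative condition near S = 0.

```latex
\begin{align*}
L_1(S) &= 1 - \tfrac{1}{4} S^2 \quad (0 \le S \le 2): &
  L_1'(S) &= -\tfrac{1}{2} S \ge -1, &
  L_1''(S) &= -\tfrac{1}{2} < 0 \ \text{(not convex)}, \\
L_2(S) &= \frac{4}{1 + S} \quad (S \ge 0): &
  L_2''(S) &= \frac{8}{(1 + S)^3} > 0 \ \text{(convex)}, &
  L_2'(0) &= -4 < -1 .
\end{align*}
```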

3. A Set of Sufficient Conditions under which A FIFO Policy Is Optimal 

The following theorem gives a set of sufficient conditions under which a FIFO 
policy is optimal. 

Theorem 2:

(i) If $L'(S) \ge -1$, and

(ii) FIFO is an optimal policy when n = 2, then FIFO is an optimal policy
for $n = 3, 4, \ldots$.

Proof: The proof will be by induction. The function to be maximized by
choosing an optimum issue policy may be written in the form $Q(x) = x + L(x + S^*)$
where x denotes the field life resulting from the use of the first n - 1 items
issued and $S^*$ is the initial age of the nth item issued. $L(x + S^*)$ is then the
field life resulting from the nth item issued. From condition (i), Q(x) is
maximized by making x as large as possible. By the induction assumption x is
maximized by using a FIFO policy on the first n - 1 items issued. Now if the
issuing policy is such that the youngest item is not issued last, it follows that in
maximizing x it will be issued next to last. At this point there are only two items
left. Hence, from condition (ii) the issuing policy could be improved by
interchanging the order of issue of the remaining two items. Thus, the optimal policy
must be of the form where the youngest item is issued last. This being established,
the theorem follows from the application, again, of the induction assumption.

1 This result was obtained jointly with Professor A. Dvoretzky.

Thus, it is interesting to note that the problem of determining whether LIFO 
or FIFO is an optimal policy resolves itself into determining whether LIFO or 
FIFO is optimal for the case n = 2. 


4. A Set of Sufficient Conditions under which FIFO Is Optimal for n = 2 

The following theorem gives a set of sufficient conditions under which a FIFO 
policy is optimal. 

Theorem 3:

(i) If $L'(S) \ge -1$ and

(ii) L(S) is a non-increasing or non-decreasing concave function, then FIFO
is an optimal policy.

Proof: Condition (i) implies that if the theorem is true for n = 2, it is true for
all n. Hence, it is sufficient to show that condition (ii) implies that the theorem
is true for n = 2.

A) Assume that L(S) is a non-increasing function.

If $S_2 > S_1$, it is necessary to show that

(*) $L(S_2) + L[S_1 + L(S_2)] \ge L(S_1) + L[S_2 + L(S_1)]$.

Since L(S) is non-increasing, it follows that

$L[S_2 + L(S_1)] \le L[S_2 + L(S_2)]$.

Using this inequality, it follows that (*) holds whenever

$L(S_2) + L[S_1 + L(S_2)] \ge L(S_1) + L[S_2 + L(S_2)]$.

From condition (ii), L(S) is concave. Hence,

$$\frac{L(S_2) - L(S_1)}{S_2 - S_1} \ge \frac{L[S_2 + L(S_2)] - L[S_1 + L(S_2)]}{[S_2 + L(S_2)] - [S_1 + L(S_2)]}.$$

Equivalently,

$L(S_2) + L[S_1 + L(S_2)] \ge L(S_1) + L[S_2 + L(S_2)]$,

and the result is obtained. It has been tacitly assumed that L(S) is concave only
for S such that L is positive. In the above proof the range of the argument never
goes outside of this region provided $S_2$ lies within, since $[S_2 + L(S_2)]$ is the
maximum value of S considered, and this must always be in the region where the
function is concave when $L'(S) \ge -1$. If $S_2$ is such that $L(S_2)$ is already 0, (*)
holds trivially for all values of $S_1 < S_2$.
B) Assume that L(S) is a non-decreasing function.

Again it is necessary to show that (*) holds. Since L(S) is a non-decreasing
function

$L[S_2 + L(S_1)] \le L[S_2 + L(S_2)]$

and the same proof as in (A) goes through.
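
A small enumeration illustrates Theorem 3 in the non-increasing case. The field life function below is a hypothetical example: it is concave and non-increasing on [0, 2] with derivative -S/2, which is never less than -1 there, and it is taken to be zero for older items. The ages and helper names are likewise assumptions.

```python
from itertools import permutations

def L(s):
    # Hypothetical field life: 1 - s^2/4 on [0, 2], zero once the item is useless.
    return max(0.0, 1.0 - s * s / 4.0)

def total_field_life(ages, order):
    """Total field life when items are issued in `order` (indices into `ages`)."""
    elapsed = 0.0
    for i in order:
        elapsed += L(ages[i] + elapsed)   # the item ages while it waits in stock
    return elapsed

ages = [0.1, 0.5, 0.9, 1.3]
fifo = tuple(sorted(range(len(ages)), key=lambda i: -ages[i]))   # oldest first
fifo_total = total_field_life(ages, fifo)
best = max(total_field_life(ages, p) for p in permutations(range(len(ages))))
print(fifo, round(fifo_total, 5), round(best, 5))
```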

5. Remarks 

In the statement of the original problem, only those policies in which an item
is completely used were considered. It is conceivable that a policy calling for
only partial use of some of the items might yield a greater total field life than
the optimum obtained over the class of policies which were considered. Derman
and Klein show that under their conditions LIFO (using items until failure) is
still the optimum policy. Essentially the same proof will go through for
Theorems 1, 2, and 3 in this paper, with, of course, the optimal policy being FIFO
(using items until failure) for the last two theorems.

For Theorem 3, the precise function L(S) need not be known. Only information 
as to its concavity (within limits) and its derivative never being less than minus 
one is required for an optimal policy of FIFO to be chosen. 

Finally, a result similar to Theorem 3 for LIFO is desirable since Theorems 1 
and 2 require that L(S) be known exactly in order to verify the results for 
n = 2. 
DISCUSSION: SEQUENCING n JOBS ON TWO MACHINES
WITH ARBITRARY TIME LAGS*

S. M. JOHNSON
The RAND Corporation

This note presents an alternate proof of a result of L. G. Mitten, solving the 
problem of sequencing n jobs through two machines with arbitrary time lags 
when the job sequences are the same for both machines. The more difficult 
general problem is also discussed and partially solved. 

Introduction 

In [2], L. G. Mitten solved the problem of sequencing n jobs on two machines
with arbitrary time lags while minimizing total elapsed time, when it is assumed
that the job sequences on both machines are the same. He mentioned that the
general problem would sometimes involve different job sequences on the two
machines and hence would be quite difficult.

In the present note an alternate proof of Mitten's result is shown to follow as
a corollary of a special three-stage problem solved in [1] when properly
interpreted. Also Mitten's general case is discussed and partially solved.

An Alternate Derivation of Mitten’s Result 

In [2], L. G. Mitten solved the following problem. Each of n jobs must be run
first on machine I and then on machine II. Running times (including any set-up
or tear-down times) for the i-th job are $A_i$ on I and $B_i$ on II. Let $D_i > 0$ be the
arbitrary time lag associated with item i, such that job i is not started on II
sooner than $D_i$ time units after it was started on I, nor finished on II sooner
than $D_i$ time units after it was finished on I. A rule is given for determining the
sequence in which the jobs are to be run on the machines, using the same
sequence for both machines, in order to minimize the time between the start of
the first job on machine I and the completion of the last job on machine II.

Mitten's solution to this problem is as follows. Sequence those jobs whose
$A_i < B_i$ in order of increasing value of $D_i$, all such jobs coming before jobs
whose $A_i > B_i$, which in turn are sequenced in order of decreasing value of $D_i$.
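
Mitten's rule as just stated can be sketched in a few lines, together with a direct simulation of the two machines under the start and finish lags. The job data are hypothetical, jobs with $A_i = B_i$ are placed in the second group as an arbitrary convention, and machine I is assumed to run the jobs back to back from time zero.

```python
from itertools import permutations

def mitten_order(A, B, D):
    """Jobs with A_i < B_i by increasing D_i, then the rest by decreasing D_i."""
    first = sorted((i for i in range(len(A)) if A[i] < B[i]), key=lambda i: D[i])
    rest = sorted((i for i in range(len(A)) if A[i] >= B[i]), key=lambda i: -D[i])
    return first + rest

def elapsed_time(order, A, B, D):
    """Same job sequence on both machines, with start lag and finish lag D_i."""
    end_I = end_II = 0
    for i in order:
        start_I = end_I
        end_I = start_I + A[i]
        # Job i starts on II once II is free and both lag conditions are met.
        start_II = max(end_II, start_I + D[i], end_I + D[i] - B[i])
        end_II = start_II + B[i]
    return end_II

# Hypothetical job data (integers keep the comparison exact).
A = [3, 5, 1, 6, 7]
B = [6, 2, 2, 6, 5]
D = [5, 1, 5, 6, 6]
order = mitten_order(A, B, D)
best = min(elapsed_time(p, A, B, D) for p in permutations(range(len(A))))
print(order, elapsed_time(order, A, B, D), best)
```

The brute-force minimum over all common sequences is included only to check the rule on this small instance.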

We proceed as follows. 

Write $D_i$, the arbitrary time lag for the i-th job lot, as

(1) $D_i = M_i + \min(A_i, B_i)$

where $M_i > -\min(A_i, B_i)$.

If $M_i > 0$, we can interpret $M_i$ as the processing time of the i-th job lot on
all intermediate stages or non-bottleneck machines.

If $M_i < 0$, then we can interpret this case (following Mitten) as one involving
lap-phasing, starting an item on the next stage before its entire job lot is
available.

Then Mitten's problem can be interpreted as a restricted three-stage problem,
which was treated in [1].

* Received Nov. 1958.

The general three-stage time matrix discussed in [1], in terms of present
notation, is

$$\begin{pmatrix} A_1 & A_2 & \cdots & A_n \\ M_1 & M_2 & \cdots & M_n \\ B_1 & B_2 & \cdots & B_n \end{pmatrix}$$

For this sequence $S_0 = (1, 2, \ldots, n)$ the total time to process all the jobs is

(2) $T(S_0) = \max_{1 \le u \le v \le n} \left( \sum_{i=1}^{u} A_i + \sum_{i=u}^{v} M_i + \sum_{i=v}^{n} B_i \right)$.

We wish to permute the columns of the time matrix to find the minimum T(S).
This three-stage problem was solved in [1] only for a special set of assumptions
leading to the conclusion that u = v in (2). In the present problem the same
restriction u = v holds since the total time for a given sequence,
$S = (1, 2, 3, \ldots, n)$, is

(3) $T(S) = \max_{1 \le u \le n} \left( \sum_{i=1}^{u} A_i + M_u + \sum_{i=u}^{n} B_i \right)$,

that is, there is no bottleneck on the intermediate stage. Then the solution in
[1] was shown to reduce to an equivalent two-stage problem with an optimal
solution given by the transitive rule: Item i precedes item j if

(4) $\min(A_i + M_i, B_j + M_j) < \min(A_j + M_j, B_i + M_i)$

with ties ordered either way. In [1] it was shown that this led to an easy method
of scheduling:

Find the smallest number of the set of 2n numbers $(A_i + M_i, B_i + M_i)$,
$i = 1, 2, \ldots, n$. If it is an $A_i + M_i$, place that item i first; if it is a $B_i + M_i$,
place that item last. Then repeat on the reduced set of items until all are
ordered.
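
The scheduling method just described, together with the total time formula (3), can be sketched as follows. The data are hypothetical, ties are broken by an arbitrary convention (the text allows either order), and the function names are introduced here only for illustration.

```python
from itertools import permutations

def schedule(A, M, B):
    """Repeatedly take the smallest of the numbers A_i + M_i, B_i + M_i."""
    remaining = list(range(len(A)))
    front, back = [], []
    while remaining:
        value, kind, i = min(
            min((A[i] + M[i], 0, i), (B[i] + M[i], 1, i)) for i in remaining
        )
        if kind == 0:
            front.append(i)        # smallest number is an A_i + M_i: place item first
        else:
            back.insert(0, i)      # smallest number is a B_i + M_i: place item last
        remaining.remove(i)
    return front + back

def total_time(order, A, M, B):
    """Formula (3): max over u of (A-sum up to u) + M_u + (B-sum from u on)."""
    return max(
        sum(A[i] for i in order[:u + 1]) + M[order[u]] + sum(B[i] for i in order[u:])
        for u in range(len(order))
    )

# Hypothetical data satisfying M_i > -min(A_i, B_i).
A = [3, 5, 1, 6, 7]
M = [2, -1, 4, 0, 1]
B = [6, 2, 2, 6, 5]
order = schedule(A, M, B)
best = min(total_time(p, A, M, B) for p in permutations(range(len(A))))
print(order, total_time(order, A, M, B), best)
```

The enumeration over all sequences serves only as a check on this small instance.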

Mitten's rule gives an alternate interpretation of (4).

If $A_i < B_i$, $B_j < A_j$, item i precedes item j from (4).

If $A_i < B_i$, $A_j < B_j$, then (4) implies item i precedes item j if
$A_i + M_i < A_j + M_j$, that is, if $D_i < D_j$.

Similarly, if $B_i < A_i$, $B_j < A_j$, item i precedes item j if $D_j < D_i$.

The two rules are equivalent (except possibly for conventions concerning
ties). In any case, both give the same total time.

Note that Mitten’s rule also applies to the standard two-stage problem in 
[1] and is perhaps easier to remember than the rule given in [1]. 






The General Case Allowing Different Job Sequences 

In the problem not treated by Mitten, where different job sequences are 
allowed for the two bottleneck machines, the following remarks will reduce the 
problem in most cases down to a relatively small list of sequences whose total 
times can be compared and the optimal sequence found. 

One can easily show that for every sequence $S_I$ on machine I there is an
optimal sequence on machine II given by processing items in order of their
availability times on II. For suppose the job sequence $S_{II}$ on II is not in the same order
as the sequence $\{t_i\}$, where $t_i$ is the availability time for item i on II for a given
fixed sequence $S_I$ on I. If items i and j of $S_{II}$ are not in the same order as $t_i$
and $t_j$, then they both must have started on II at a time after $\max(t_i, t_j)$.
Thus we can interchange consecutive items i and j on II without loss of time. By
successive interchanges, starting from the left, of consecutive pairs of those
items which are out of order we can reorder $S_{II}$ to match up with the sequence
$\{t_i\}$ without loss of time. Then start each item on machine II as soon as possible.
This reduces the problem from $(n!)^2$ cases to n! cases.

But then use symmetrical arguments (reversing the time scale) to find an
optimal sequence on I for a given sequence on II. Repeating this process we
eventually find a pair of mutually optimal sequences $(S_I, S_{II})$. However, and this
is the real difficulty, there may be many such pairs of sequences which satisfy
this necessary condition for over-all optimality.

Nevertheless, the above technique leads to a proof of the following useful 
result. 

Theorem 

A necessary condition for a reversal of order of consecutive items i, j on I to
j, i on II in a pair of mutually optimal sequences $(S_I, S_{II})$ is that

(5) $M_i > M_j + \max(A_j, B_j)$.

This is also a sufficient condition provided item i is not reversed with its
preceding item on I and item j is not reversed with its following item on II.

Since $t_i > t_j$, $M_i > A_j + M_j$. Symmetrically, $t'_i > t'_j$ or $M_i > B_j + M_j$,
giving (5).

Now for simplicity assume there is only one $M_i$ satisfying (5) for several
items j, k, l, say. The question of which item should be interchanged with item
i seems to be too hard to answer by any simple decision rule. We propose to try
each case and compare. Then if we try reversing i with j where

$M_j + A_j + B_j > M_i > M_j + \max(A_j, B_j)$,

the two consecutive columns for items i and j in the time matrix
can be replaced by a single column

$$\begin{pmatrix} A_i + A_j \\ M_j \\ B_j + B_i \end{pmatrix}$$

corresponding to a single fictitious item, as far as computing total time is
concerned. To see this consider Gantt Chart 1. Here $M_j$ is the analogous time lag
between the finish of the job pair (i, j) on I and the start of the job pair (j, i)
on II.

Similarly, if

(5'') $M_i > M_j + A_j + B_j$,

the new column is

$$\begin{pmatrix} A_i + A_j \\ M_i - A_j - B_j \\ B_j + B_i \end{pmatrix}$$

since from Gantt Chart 2 the analogous delay time is $M_i - A_j - B_j$.

Thus we have replaced the reversed pair of items by a single fictitious item
and now are left with a problem of the type solved by Mitten. All that is
required is to insert this "item" into its proper place in the sequence given by
Mitten's rule.


[Gantt Chart 2]
Next we try each possible reversal with item i, and compute the total time
for such optimally ordered sequences and compare. Note it may still be best to
have no reversal for item i even though (5) is satisfied for item i and some
item j since they may not be adjacent in the optimal sequencing.

In general, there may be several pairs of items satisfying (5) or even
combinations of 3 or more items calling for permutations of order from $S_I$ to $S_{II}$. If
there are not too many possible cases, each of these can be worked out and
compared.

Further analysis yields some dominance rules concerning which items should 
be interchanged but the results are too special to be of very much value. 
VARIETY IN RETAILING*

WILLIAM J. BAUMOL and EDWARD A. IDE
Princeton University and Alderson and Sessions

Many marketing problems which promise to be amenable to the techniques 
of operations research have apparently not been subjected to systematic analy¬ 
sis. This article is a first attempt at an analysis of one such area—the number 
of items stocked by a retailer and its relation to his sales, his costs, and his 
profits. 

The analysis of the relations among these variables permits the development 
of criteria for an optimal variety of items; that is, of expressions which can 
indicate to the retailer whether an increase or a reduction in the number of 
commodities, styles and brands which he offers for sale will enhance his profits. 
The discussion also throws some light on a number of well-known retailing 
phenomena like the growth of suburban shopping centers and supermarkets. 
By and large, these results are reassuring rather than startling. 

The tentative nature of the model cannot be overemphasized. Its structure 
has purposely been greatly simplified. In particular, linearity has been as¬ 
sumed wherever it does not seem to conflict directly with the properties which 
the expressions are intended to describe. It is, therefore, noteworthy how often 
non-linearities have imposed themselves on the model or have arisen out of the 
mathematical manipulations. 

I. Equilibrium of the Consumer 

1. The gains from increased variety 

A shopper does not know in advance (with certainty) whether he will obtain 
what he wants by entering a particular shop, i.e. whether it does or does not 
carry some of the items he desires. Generally, there will be one or several alterna¬ 
tive sets of items, the availability at acceptable prices of any one of which will 
make the shopping trip successful in the consumer’s view. The greater the num¬ 
ber of items carried by the store he enters, the greater, ordinarily, is the con¬ 
sumer’s reason for expecting that the shopping trip will in this sense be suc¬ 
cessful. Of course, this is only true so long as any additional items carried are 
not known to exclude all commodities desired by the consumer. For example, a 
known addition to a store’s line of paints will not help attract necktie shoppers. 

* This article is a product of the empirical and theoretical investigations carried out 
in connection with the Alderson & Sessions Basic Research Program. For a description of 
the other aspects of the program see: COST AND PROFIT OUTLOOK, Vol. IX, No. 2, 
February, 1966. COST AND PROFIT OUTLOOK, Vol. IX, No. 3, March, 1966. 
PRINTERS’ INK, January 20, 1966, p. 26. BUSINESS WEEK, November 12, 1966, p. 68. 

This can readily be translated into probabilistic terms. Let N be the number 
of different items, i.e., the number of varieties, sold by the retailer. Then we 
may write 

(1) p(N) 

for the probability that the consumer will find some set of items in the store 
which will make his trip successful. On the usual convention, we have 
$0 \le p(N) \le 1$ where, for example, $p(N) = 1$ means certain foreknowledge of 
success. In the absence of specific customer information about the nature of the 
items carried in the store, p(N) will be close to unity only if the customer is 
easily satisfied or if N is very large. 

Since an increase in the number of items stocked is taken to increase the 
probability of success in shopping, we also have $dp/dN \ge 0$. 

It must be emphasized that since we are here not primarily interested in the 
influence of prices and advertising, they are both assumed to remain unchanged 
throughout. In particular, they are taken to be unaffected by the number of 
items stocked by the retailer. Of course, this is not likely to occur in practice. 
The influence of both these variables is clear. When there is a decrease in prices 
or an increase in informative advertising, there will be an increase in the prob¬ 
ability of successful shopping trips, for consumers are then more likely to know 
which store carries the items they want and are more likely to find them of¬ 
fered for sale at acceptable prices. 

2. The costs of shopping 

In going to some particular store the customer incurs some costs. Some of 
these represent the cost and trouble of transportation. If the distance of the 
consumer from the store is D, we assume that for him these costs are strictly 
proportionate to D and are given by $c_d D$ where $c_d$ is a constant. 

Moreover, the difficulty of shopping increases with the number of items 
stocked by the store—the more items stocked the further we must walk to get 
to the spot where some items are kept. Roughly speaking, the average distance 
walked to an item may be expected to increase as the square root of the number 
of items carried by the store if it is all located on one story since area increases 
as the square of the radius of a circle or the length of the sides of a rectangle. 
For similar reasons, if the store operates with a multi-story building we might 
expect these costs to vary as the cube root of the number of items offered for 
sale. For our purposes, we shall assume that these costs are directly 
proportionate to the square root of the number of items stocked, and are given by 
$c_n \sqrt{N}$. 

Finally, there are costs which do not vary with the number of items sold or 
the consumers’ distance from the store. Simply taking the initiative to shop 
involves time and effort as well as opportunity costs, including other shopping 
opportunities foregone. For example, a shopper knows that by spending the 
day shopping, she may be giving up a chance to catch up with her darning or 
to spend a quiet evening at home. For some who enjoy shopping this cost, $c_i$ 
(and perhaps $c_n$), may be negative. It should be emphasized that $c_i$ is defined as 
a total, not an average cost, and includes the opportunity cost of foregoing 
other alternative shopping trips. 

Thus the costs of shopping to the consumer are assumed to be given by the 
sum of these three classes of cost, i.e., by 

(2) $c_d D + c_n \sqrt{N} + c_i$. 

3. The demand function 

Presumably the decision to shop or not to shop at a given retail outlet will 
result from a weighing of the probability of success as given by (1) against the 
costs of shopping (2). We assume that the typical consumer does this simply 
by assigning unconsciously subjective weights w and v (both of which are taken 
to be positive) to the two components and then seeing which is the larger. The 
constants v and w are viewed as being invariant over a collection of stores of¬ 
fering similar types of assortments. Here the relevant probability function p(N) 
is presumably subjective and its relation to the objective probability function 
is a matter for empirical investigation. 

Thus the consumer will not shop at this store unless for him 

(3) $f(N, D) = w\,p(N) - v(c_d D + c_n \sqrt{N} + c_i)$ 

is positive.¹ The function f(N, D) is a measure of the consumer’s expected net 
benefit from entering the store in question and shows how this will vary with 
D, his distance from the store, and N, the number of items offered for sale there. 

We can now examine the effect of variation in the number of items carried. 
An increase in the variety of items handled by the store will involve an in¬ 
crease in the probability of success but it will also increase shopping costs. We 
must see how f(N, D ) will vary with N. Direct observation yields the following 
results: 

a) When the number of items stocked is small the function will be negative. 
Specifically, $f(0, D) < 0$ since, when nothing is stocked by the store, the 
probability of success $p(0) = 0$; and hence the term of which w is the coefficient, 
and which is ordinarily the only non-negative term in (3), becomes zero. This 
point amounts to the trivial observation that it does not pay to shop in an 
empty store. 

b) For very large values of N, f will also be negative since the first term in 
(3) can never² exceed w while $v c_n \sqrt{N}$ grows indefinitely large. 

c) For intermediate values of N and small values of D, f will be positive if w 
is sufficiently large relative to v, i.e. if the probability of finding what he wants 
is weighted sufficiently highly by the customer relative to shopping cost. 

d) In this case if the expression is assumed to be continuous throughout, i.e., 
if p(N) is continuous, f must attain at least one maximum in the intermediate 
range. 

e) After f has passed its maximum, the term $-v c_n \sqrt{N}$ will ultimately 
dominate the expression. Since this term decreases at a decreasing rate (positive 
second partial derivative with respect to N), this must eventually also be 
characteristic of f. 

f) Marginal and average values of f will be negative for low values of N. They 
will subsequently become positive and finally decline to a negative value again 
after reaching a maximum. 

1 Since $c_i$ takes account of foregone opportunities to shop elsewhere (i.e., f(N, D) is a 
measure of net benefits), the value of f will be negative for any store which does not offer 
the consumer maximum expected gross benefit. 

2 Of course, this is really only an artifact resulting from the linearity of the expression 
and the constancy of the model’s coefficients but it seems also to be rather reasonable, 
especially in view of what follows. 
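The sign pattern described in (a) through (e) can be checked numerically. The following is only a sketch: the article leaves p(N) unspecified, so the saturating form `p(n) = 1 - exp(-n / N0)` and every coefficient value below are assumptions made here for illustration.

```python
import math

# Hypothetical coefficients; the article gives no numerical values.
W, V = 10.0, 1.0                # subjective weights w and v
C_D, C_N, C_I = 0.5, 0.3, 1.0   # cost coefficients c_d, c_n, c_i
N0 = 50.0                       # scale of the assumed p(N)


def p(n):
    """Assumed success probability: p(0) = 0, non-decreasing, bounded by 1."""
    return 1.0 - math.exp(-n / N0)


def f(n, d):
    """Expected net benefit of equation (3)."""
    return W * p(n) - V * (C_D * d + C_N * math.sqrt(n) + C_I)


if __name__ == "__main__":
    # Net benefit at distance D = 1 for an empty, a medium and a huge store.
    for n in (0, 100, 10_000):
        print(n, round(f(n, 1.0), 3))
```

With these assumed values the net benefit is negative for an empty store, positive for an intermediate assortment, and negative again once the walking cost term dominates.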

4. Economic implications 

These conclusions have several rather common sense economic implications: 

a) Increased variety is an advantage to a consumer only up to a point. Ulti¬ 
mately a store may stock so large a variety of items that shopping costs become 
prohibitive. This suggests why Sears Roebuck might find it profitable to cata¬ 
logue many lines which Macy’s will not carry. By issuing separate catalogues 
for different lines some mail order houses have been able to reduce c n further 
and thereby have made an even larger N feasible. 

b) The minimum number of items necessary to induce a consumer to shop 
at a given store will increase with D, his distance from that store. This is simply 
the plausible assertion that the high shopping costs of a distant consumer can 
only be overcome by a high probability of a successful shopping trip. 

c) The optimum variety from the consumer's point of view, i.e., that value 
of N for which f(N, D) is a maximum, is independent of his distance from the 
retailer. This was assumed directly in the form postulated for the function f, 
for the term in f(N, D) which contains D does not contain N. This term will 
therefore drop out when we solve for the optimum N by setting 
$\partial f / \partial N = 0$. 

d) For every value of N, there will be a maximum consumer distance from 
the store beyond which it will not pay this consumer to purchase from this 
shop. The net benefit is a function also of the place of residence of the 
consumer. Thus about each store and for a particular net benefit, i.e., for a 
particular value of f(N, D), we may conceive of a contour line or indifference curve 
associated with a locus of residence about the store. The maximum shopping 
distance is given by the equation of the indifference curve which offers the 
consumer zero net benefit from shopping at this store. This maximum distance is 
obtained by setting f(N, D) = 0 (zero net benefit) and solving for D to yield 

(4) $D_m = \dfrac{w}{v c_d}\, p(N) - \dfrac{1}{c_d}\left(c_n \sqrt{N} + c_i\right)$. 
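Implications (c) and (d) can be illustrated with a short computation. This is a sketch under stated assumptions: the form of `p` and all coefficient values are hypothetical, and `best_n` is a helper introduced here, not part of the article.

```python
import math

# Hypothetical values chosen for illustration only.
W, V = 10.0, 1.0
C_D, C_N, C_I = 0.5, 0.3, 1.0


def p(n):
    """Assumed p(N); the article leaves its form open."""
    return 1.0 - math.exp(-n / 50.0)


def net_benefit(n, d):
    """Equation (3)."""
    return W * p(n) - V * (C_D * d + C_N * math.sqrt(n) + C_I)


def max_distance(n):
    """Equation (4): the distance at which net benefit is zero."""
    return (W / (V * C_D)) * p(n) - (C_N * math.sqrt(n) + C_I) / C_D


def best_n(d, n_max=1000):
    """Grid search for the N that maximizes net benefit at distance d."""
    return max(range(1, n_max + 1), key=lambda n: net_benefit(n, d))
```

Evaluating `net_benefit(n, max_distance(n))` should return zero up to rounding, and `best_n` should return the same N for any distance, since D enters (3) only through a term that does not involve N.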

More economic implications of our model will be indicated in the next section. 


II. The Retailer’s Demand Situation 

1. The aggregate demand function 

From the point of view of the retailer, a function very much like f(N, D) 
may be taken to determine the proportion of the population which shops at his 
establishment. This may be related directly to sales in the following manner. 
Suppose, once a customer decides to shop at this particular store, the number 
of items he buys is independent of the number of items available. This 
assumption is clearly false and we shall modify it later. This premise implies that the 
volume of a store’s sales will depend directly on the number of individuals 
who can be induced to shop there, i.e., on a relationship like (3). At any distance 
from the retailer, sales will, in the simplest circumstances, vary directly with the 
proportion of the population which decides to shop at this outlet. This is strictly 
in accord with our decision to employ linear assumptions wherever possible. 
In this case, the proportion of the population residing at a distance D from 
the store which will decide to shop at this store is given by a function similar 
in form to f(N, D). Let us take capital letters to indicate the parameters in 
the new function analogous to those represented by lower case letters in f. The 
new function may then be written 

(5) $F(N, D) = W P(N) - V(C_d D + C_n \sqrt{N} + C_i) = A(N) - V C_d D$, 

where $A(N) = W P(N) - V(C_n \sqrt{N} + C_i)$. It should be noted that since 
$V C_d D$ does not vary with N both F(N, D) and A(N) will be similarly affected 
by changes in the value of N. It should also be observed that while f and F 
are similar in form the latter should be derived independently from the data 
rather than from some process of aggregation of the f's of different shoppers. 

To determine the volume of sales, we must also know something about the 
density of population in the area surrounding the store; that is, the distribution 
of population within the area whose boundaries are given by the relationship 
obtained by substituting the parameters of F for those of f in (4) and which lies 
within a distance $D_m$ from the store. We discuss only two very simple 
possibilities in line with our determination to simplify the model to the utmost: 

Case i. Population per square mile is everywhere given by the constant K so 
that population within an area of radius $D_a$ is $K \pi D_a^2$. 

Case ii. The store is located at the point of greatest population concentration, 
and population density, K/D, varies inversely with the distance from the 
retailer. The area lying within a distance $D_a$ from the store is $\pi D_a^2$. The 
population within the circular area of radius $D_a$ is given by 

$$\int_0^{\pi D_a^2} \frac{K}{D}\, d\,\text{Area} = \int_0^{D_a} \frac{K}{D}\, 2\pi D\, dD = 2\pi K D_a.$$


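First consider case i. Here 

$$\begin{aligned}
\text{Sales} &= \int F(N, D)\, d\,\text{Population} \\
&= \int_0^{D_m} [A(N) - V C_d D]\, 2\pi K D\, dD \\
&= 2\pi K \left[ A(N) \frac{D^2}{2} - V C_d \frac{D^3}{3} \right]_0^{D_m} \\
&= 2\pi K D_m^2 \left( \frac{A(N)}{2} - \frac{V C_d D_m}{3} \right).
\end{aligned}$$

Now from (4) and (5) 

$$D_m = \frac{A(N)}{V C_d}.$$

Substituting this in our results yields³ 

(6) $\text{Sales} = \dfrac{2\pi K}{(V C_d)^2}\, A(N)^3 \left( \dfrac{1}{2} - \dfrac{1}{3} \right) = \dfrac{\pi K}{3 (V C_d)^2}\, A(N)^3 = \dfrac{1}{3}\, V C_d\, \pi K D_m^3$. 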

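Turning now to case ii, 

$$\text{Sales} = \int_0^{D_m} [A(N) - V C_d D]\, 2\pi K\, dD = 2\pi K \left[ A(N) D - V C_d \frac{D^2}{2} \right]_0^{D_m}.$$

Thus⁴ 

(7) $\text{Sales} = 2\pi K D_m A(N) - \dfrac{\pi K}{V C_d}\, A(N)^2 = \dfrac{\pi K}{V C_d}\, A(N)^2 = V C_d\, \pi K D_m^2$. 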

In both cases sales will vary directly with the maximum customer distance, 
in one case as the cube of that distance and in the other as the square of that 
distance. 
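The closed forms (6) and (7) can be compared against a direct numerical integration of F(N, D) over the market area. The sketch below uses a midpoint rule; the function names and the parameter values suggested in the note that follows are assumptions introduced for illustration, and `a` stands for a positive value of A(N).

```python
import math


def uniform_density(d, k):
    """Case i: K persons per square mile everywhere."""
    return k


def inverse_density(d, k):
    """Case ii: density K / D, falling with distance from the store."""
    return k / d


def sales_uniform_closed(a, v_cd, k):
    """Equation (6), with a = A(N) and v_cd = V * C_d."""
    return math.pi * k * a ** 3 / (3.0 * v_cd ** 2)


def sales_inverse_closed(a, v_cd, k):
    """Equation (7), with a = A(N) and v_cd = V * C_d."""
    return math.pi * k * a ** 2 / v_cd


def sales_numeric(a, v_cd, k, density, steps=20_000):
    """Midpoint-rule integral of F * density * 2*pi*D over 0 <= D <= D_m."""
    d_m = a / v_cd                      # maximum shopping distance
    h = d_m / steps
    total = 0.0
    for j in range(steps):
        d = (j + 0.5) * h               # midpoint, so d is never zero
        total += (a - v_cd * d) * density(d, k) * 2.0 * math.pi * d * h
    return total
```

For example, with the hypothetical values a = 2, v_cd = 0.5 and k = 100, `sales_numeric` with each density should agree with the matching closed form to within the integration error.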


2. Economic implications 

These results also enable us to discuss the relationship between the expected 
sales of the retailer and the number of lines he offers for sale directly in terms of 
A(N). This, as can be seen from (5) will vary with N as does F(N, D ) or, by 
analogy, f(N, D) whose shape we have already analyzed. We are thereby led 
to a number of economic conclusions. 

a) An increased number of items will at first yield increasing average returns, 
then decreasing marginal and average returns. Finally, it will yield negative 
marginal returns. 

b) This means that, even neglecting considerations of retailer costs, it will 
not pay a store to proliferate limitlessly the variety of items it carries. There 
will be some maximum value of sales which can be found by setting the 
derivatives of (6) and (7) with respect to N equal to zero. 

3 This result can be made intuitively plausible as follows: A(N) is an index of any one 
customer’s inducement to purchase when he is located near the store. It also determines 
the maximum shopping distance, i.e., the radius $D_m$ is proportionate to A(N); specifically 
$D_m = A(N)/(V C_d)$ by (4). Because population is taken to be uniformly distributed, the 
number of persons in the area will be proportionate to $\pi D_m^2$. Since total sales equals sales 
per person times the number of persons, they will be given by a constant multiplied by 
$A(N)\, \pi D_m^2 = A(N)^3 \pi / (V C_d)^2$, which is essentially our result. 

4 This implies that sales would be positive for very large or small values of N when 
A(N) is negative, which is clearly nonsense. This peculiarity arises because A(N), sales 
per customer, would supposedly have negative values at these values of N whereas in fact 
sales can never be less than zero. It would be more appropriate (though the complication 
is not worth it) to employ instead of A(N) the function with discontinuous derivative 
given by A(N) for intermediate values of N but which is zero elsewhere. 

c) The existence of a range of rising average sales deserves attention. This 
conclusion asserts that not only will total sales be increased up to a point by an 
increase in the variety stocked but the sales per item carried will also rise. In¬ 
creased variety will then be attracting a disproportionate number of additional 
customers. 

d) Our results are obviously in line with the common sense explanation of 
the reason why large retailers tend to locate at metropolitan centers since high 
population density means a large value of K. 

e) The results are consistent with the recent increase in emphasis on 
decentralized retailing and large suburban shopping centers. In terms of our model 
this can be accounted for in two ways: the increased movement of 
population toward the suburbs, which involves a rise in the K pertaining to the 
suburban relative to the metropolitan dealers, and the increasing difficulty of 
driving into and parking in cities, which in our model involves increases in $C_d$ 
and $C_i$ for the metropolitan dealers. This suggests that $C_d$ and $C_i$ are themselves 
functions of population density. 

f) The analysis also fits in with the supermarket phenomenon in grocery 
retailing. The size of these giant stores can partly be explained by their 
relatively low $C_n$, i.e., the relative ease with which a consumer can get at additional 
items which results from the layout permitted by their spaciousness and from 
their self-service arrangements. Parking lots offer a relatively low $C_i$. The 
supermarkets’ methods of handling and prepackaging also reduce their 
inventory and handling costs, and this, as shall be seen in the next section, tends 
to make for a high value of N, the number of items stocked. 

Supermarkets are still increasing the number of items they handle and going 
into the sale of toiletries, housewares, clothing and appliances. However, it is 
our impression that no very great further increase in the number of items car¬ 
ried is to be expected in the absence of a marked autonomous or induced change 
in the value of the coefficients. 

3. Purchases per customer and the number of items stocked 

So far, we have retained the false assumption that purchases per consumer 
are independent of the number of items stocked by a retailer. Yet up to a point 
the more items stocked the more likely is a consumer to run into things he had 
not been planning to buy on this trip but which on being observed become 
irresistible. This may well serve as a partial offset to the ultimately diminishing 
returns to an increased number of items carried. But there would appear to be 
limits to this offset. The customer may not be able to look over more items in 
a very large store than in a moderately large store simply because of time limi¬ 
tations. After some point, further increases in N may then yield no further 
increases in sales per customer, though it is conceivable that this value of N is 
well beyond the relevant range. 

III. The Profit Maximizing Variety 

1. The retailers’ costs 

To determine an optimum variety from the point of view of the retailers’ 
profits we must include in our model a discussion of the effects of changes in 
variety on his costs. These effects will primarily involve inventory and handling 
costs. 

In inventory theory, it is customary, as a first approximation, to deal with 
inventory costs as follows: Let E be the mean expected sales volume of all 
commodities per period, r be the handling, clerical and other related costs of 
each reordering, T be the warehousing costs per item per period and I the 
quantity ordered for inventory each time stocks are replaced. Suppose, 
moreover, that inventory is replaced when, and only when, stocks on hand fall to 
level R. Then inventory costs per commodity are, on these simplest 
assumptions, given by 

(8) $\dfrac{E}{I}\, r + \left( \dfrac{I}{2} + R \right) T$. 

The first term represents the cost of keeping the inventory replenished, for 
E/I is the number of times during a period that inventory will be depleted if 
sales go on at a steady rate. Since r is the cost of reordering once, then total 
reordering cost will be r multiplied by E/I, the number of times reordering will 
take place. The second term in (8) represents warehousing cost since the 
quantity of the commodity held in stock will vary between I + R and R so that 
the average level of inventory will be approximately (I/2) + R. 

Costs can be minimized by picking an appropriate level of I. Setting the first 
derivative of (8) with respect to I equal to zero yields the well known result 
$I = \sqrt{2Er/T}$. Substituting this in (8) gives minimum 

$$\text{cost per item} = \frac{Er}{\sqrt{2Er/T}} + \frac{1}{2} \sqrt{\frac{2Er}{T}}\, T + RT = \sqrt{2rTE} + RT.$$
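The lot-size result can be verified numerically. This is a minimal sketch of expression (8) and its minimizer; the function and argument names are introduced here for illustration, with `r_level` standing for the reorder level R.

```python
import math


def inventory_cost(lot, e, r, t, r_level):
    """Expression (8): reordering cost plus warehousing cost for lot size I."""
    return (e / lot) * r + (lot / 2.0 + r_level) * t


def optimal_lot(e, r, t):
    """Lot size I = sqrt(2Er/T) that minimizes expression (8)."""
    return math.sqrt(2.0 * e * r / t)


def minimum_cost(e, r, t, r_level):
    """Minimum cost per item, sqrt(2rTE) + RT."""
    return math.sqrt(2.0 * r * t * e) + r_level * t
```

Evaluating `inventory_cost` at `optimal_lot(e, r, t)` should reproduce `minimum_cost`, and any lot size somewhat above or below the optimum should cost more.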

In the simplest circumstances, the total cost of carrying N items will be equal 
to fixed costs, Q, plus N times the cost of handling one item plus the additional 
costs resulting from the increased complexity of handling a variety of items. 
We may then take this cost to be given by 

(9) $Q + N\sqrt{2rTE} + NRT + a\sqrt{N}$ 

where the last term is again given a square root form on the argument that the 
average distance to any one item will increase as the square root of the number 
of items so long as all handling takes place in a one-story building. 

We can combine (9) with our previous expression for E, the expected mean 
sales per commodity, to obtain an expression for minimum retailing costs. Using 
(7) rather than (6), for illustrative purposes, total sales are $(\pi K / V C_d)\, A(N)^2$ so 
that average sales will be this expression divided by N. 

Substituting this for E in (9) and writing $b = \sqrt{2\pi K r T / (V C_d)}$ yields as the 
expression for minimum retailing cost 

(10) $Q + b \sqrt{N}\, A(N) + NRT + a\sqrt{N}$. 

5 See, e.g., T. M. Whitin, The Theory of Inventory Management, Princeton, 1953, pp. 
31-33. 
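The substitution that turns (9) into (10) can be checked term by term. The sketch below assumes A(N) is positive, so that its square root simplifies as in the text; all names and the numerical values suggested afterwards are hypothetical.

```python
import math


def mean_sales_per_item(n, a_n, k, v_cd):
    """Total sales from (7) divided by N; a_n stands for A(N) > 0."""
    return math.pi * k * a_n ** 2 / (v_cd * n)


def cost_eq9(n, e, q, r, t, r_level, a):
    """Expression (9): Q + N*sqrt(2rTE) + NRT + a*sqrt(N)."""
    return q + n * math.sqrt(2.0 * r * t * e) + n * r_level * t + a * math.sqrt(n)


def cost_eq10(n, a_n, q, r, t, r_level, a, k, v_cd):
    """Expression (10), with b = sqrt(2*pi*K*r*T / (V*C_d))."""
    b = math.sqrt(2.0 * math.pi * k * r * t / v_cd)
    return q + b * math.sqrt(n) * a_n + n * r_level * t + a * math.sqrt(n)
```

Feeding `mean_sales_per_item(...)` into `cost_eq9` as E should give the same figure as `cost_eq10` for the same inputs, for instance n = 120, a_n = 3, k = 100 and v_cd = 0.5.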

2. The retailer’s profits 

We may now obtain an expression for the retailer’s total profits by 
subtracting his total costs (10) from his total revenues which can be obtained by 
multiplying the sales volume (7) by an appropriate price index. Let $s = (\pi K / V C_d)\, p^*$ 
where $p^*$ is the price index; then 

$$\text{Profits} = s A(N)^2 - Q - \sqrt{N}\,[b A(N) + a] - NRT.$$

As we have seen, we may expect A(N) to have two positive roots at which 
points nothing will be sold and so profit will be negative. In between we may 
expect for reasonable values of the coefficients that the expression for profits 
will somewhere rise to a maximum which we can find by setting the first deriva¬ 
tive equal to zero. This will indicate the optimum variety in his merchandise 
from the point of view of the retailer. 

In particular, it is easy to see that the higher the handling and inventory 
costs, i.e., the higher a, r and T (and hence, the higher the value of b) the lower 
will be the optimal value of N, i.e. the smaller the variety it will pay to stock. 
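A grid search gives a rough picture of the profit-maximizing variety and of its response to the handling cost coefficient a. The sketch rests on several assumptions that are not in the article: the form of P(N), every parameter value, and the clipping of A(N) at zero (the modification suggested in footnote 4).

```python
import math

# Hypothetical parameter values; none come from the article.
W_CAP, V_CAP = 10.0, 1.0           # W and V
C_D, C_N, C_I = 0.5, 0.3, 1.0      # C_d, C_n, C_i
K, PRICE = 100.0, 1.0              # population constant K and price index p*
Q, R_COST, T_COST, R_LEVEL = 50.0, 2.0, 0.1, 1.0   # Q, r, T, R
N0 = 50.0                          # scale of the assumed P(N)


def big_a(n):
    """A(N) with an assumed P(N), clipped at zero as footnote 4 suggests."""
    p = 1.0 - math.exp(-n / N0)
    return max(W_CAP * p - V_CAP * (C_N * math.sqrt(n) + C_I), 0.0)


def profit(n, a):
    """Profits = s*A(N)^2 - Q - sqrt(N)*(b*A(N) + a) - N*R*T."""
    v_cd = V_CAP * C_D
    s = math.pi * K * PRICE / v_cd
    b = math.sqrt(2.0 * math.pi * K * R_COST * T_COST / v_cd)
    return (s * big_a(n) ** 2 - Q
            - math.sqrt(n) * (b * big_a(n) + a)
            - n * R_LEVEL * T_COST)


def best_variety(a, n_max=2000):
    """Smallest profit-maximizing N on the integer grid 1..n_max."""
    return max(range(1, n_max + 1), key=lambda n: profit(n, a))
```

Comparing `best_variety` for a small and a large value of a shows the direction of the effect stated in the text: raising a cannot increase the maximizing N, because the term in a penalizes larger N more heavily.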

As in many operations research analyses the results have been formulated in 
terms for which there is no simply obtained quantitative empirical counterpart. 
In applying results like these, improvisation and ingenuity will no doubt be 
required to obtain even approximations to the true parameters. Moreover, 
this very preliminary model will no doubt have to be modified and tailored 
case by case to fit the facts of the situation, and even then computed results 
will have to be interpreted and employed only with extreme caution. 

AN AXIOMATIZATION OF UTILITY BASED ON THE 
NOTION OF UTILITY DIFFERENCES 1 

PATRICK SUPPES and MURIEL WINET 

Stanford University, Stanford, Cal. 

1. Introduction 

In the literature of economics (e.g., [1], [9], [14]) the notion of utility differences 
has been much discussed in connection with the theory of measurement of utility. 2 
However, to the best of our knowledge, no adequate axiomatization for this differ¬ 
ence notion has yet been given at a level of generality and precision comparable 
to the von Neumann and Morgenstern construction of a probabilistic scheme for 
measuring utility. (The early study of Wiener ([21]) is not axiomatically ori¬ 
ented.) The purpose of this paper is to present an axiomatization of this notion 
and to establish the expected representation theorem guaranteeing measurement 
unique up to a linear transformation. 

Recent experimental work by economists and psychologists (see the bibliog¬ 
raphy in [8]) suggests there are cogent reasons for reviving the notion of utility 
differences in order clearly to separate utility and subjective probability. The 
interaction between probability and utility makes it difficult to make unequivocal 
measurements of either one or the other. The recent Mosteller and Nogee ex¬ 
periments ([15]) may be interpreted as measuring utility if objective probabilities 
are assumed or as measuring subjective probabilities if utility is assumed linear 
in money. 

In [6] and [7] a detailed description is given of how utility may be 
experimentally measured by use of utility differences and a single chance event with 
subjective probability 1/2. 

The scheme may be briefly described as follows.³ Let E* be a chance event with 
subjective probability 1/2, and suppose that the individual we are testing prefers 
outcome x to y, and outcome z to w. We present him with two alternative 
gambles, one of which he must choose. Gamble 1 is that if E* occurs he gets x, and if 
E* does not occur he gets w; Gamble 2 is that if E* occurs he gets z, and if E* 
does not occur he gets y. It seems intuitively reasonable to say that the 
individual should prefer Gamble 2 if and only if the utility difference between 
x and y is less than that between z and w. Once utility is measured by a procedure 
of this kind, we may measure subjective probabilities. (To some extent, this 
approach was anticipated in [17].) 
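The equivalence between the gamble choice and the comparison of utility differences can be spelled out in a few lines. The sketch assumes that the individual maximizes expected utility with subjective probability 1/2 for E*; the utility numbers used to exercise it are hypothetical.

```python
def prefers_gamble_2(u, x, y, z, w, prob=0.5):
    """True when Gamble 2 (z if E*, else y) has the higher expected utility
    than Gamble 1 (x if E*, else w)."""
    expected_1 = prob * u[x] + (1.0 - prob) * u[w]
    expected_2 = prob * u[z] + (1.0 - prob) * u[y]
    return expected_2 > expected_1


def difference_rule(u, x, y, z, w):
    """True when the utility difference x - y is less than the difference z - w."""
    return u[x] - u[y] < u[z] - u[w]
```

With probability 1/2 the two functions agree for any utility assignment, since u(z) + u(y) > u(x) + u(w) is a rearrangement of u(x) - u(y) < u(z) - u(w).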

Since the chance event E* is fixed throughout the discussion, it does not play 
any formal role in our axiomatization and enters only via one particular empirical 
interpretation of the notion of utility differences. Consequently, interpretations 
of our primitive notions, completely divorced from any probability questions, 
are available for analyzing other approaches to utility theory. A justification for 
considering alternative schemes is the limited applicability of the probabilistic 
approach just described. It can be and has been used in some laboratory experiments 
at Stanford (see [6]), but it is far from clear that it can be seriously applied to 
market behavior. An interpretation of utility differences in terms of amounts of 
money is an obvious alternative. We present such a scheme in the form of a 
reduction sentence (the general character of reduction sentences is discussed in 
[2]). For simplicity we consider a fixed individual, say, Jones, and we assume 
that a prior satisfactory analysis of preference (as opposed to preference 
differences) has already been given. 

1 This work was supported in part by the Office of Ordnance Research, U. S. Army, and 
in part by the Stanford Value Theory Project. 

2 The formally similar notion of sensation differences is important in the literature of 
psychology (e.g., [3], [12], [13], [19], [20]). 

3 The intuitive idea of this approach was primarily due to Professor Donald Davidson. 
It was suggested in [5] and has been the basis for the experiments reported in [6]. 

(1) IF: (i) Jones prefers commodity x to commodity y, and commodity u to 
commodity v, (ii) Jones has in his possession commodities y and v, and (iii) Jones is 
presented with the opportunity of paying money to replace y by x and v by u, THEN: 
the utility difference between x and y is at least as great as that between u and v if 
and only if Jones will pay at least as much money to replace y by x as to replace 
v by u. 

An obvious objection to (1) is that it has the effect, so often argued against, of 
measuring utility in terms of money. However, the only assumption needed for (1) 
is that the relation between amounts of money and utility differences is monotonic 
increasing. A linear relation is not required. In our opinion such a monotonicity 
assumption is very reasonable for a wide variety of persons and situations. 

An alternative reduction may easily be stated in terms of work. It should be 
clear that the choice of money or work is not meant to entail any special status 
for these two commodities. What is needed as a basis for constructing other 
reductions is simply the existence of a commodity flexible enough to serve in 
different situations and such that its marginal utility is either always positive or 
always negative in the situations under consideration. 

In view of the many complex issues involved in assessing the workability, even 
in principle, of such reductions, it may be more useful to describe a particular 
experimental set-up which could be used to measure utility differences. For rea¬ 
sons which will become obvious, this scheme would not be directly applicable to 
market behavior, but on the other hand it does not presuppose any fixed rela¬ 
tions between money and other commodities. 

For definiteness, we consider six household appliances of approximately the 
same monetary value, for instance, a mixer, a deluxe toaster, an electric broiler, 
a blender, a waffle iron and a waxer. A housewife who does not own any of the 
six is chosen as subject. Two of the appliances are selected at random and pre¬ 
sented to the housewife, say, the toaster and the waxer. She is then confronted 
with the choice of trading the toaster for the waffle iron, or the waxer for the 
blender. Presumably she will exchange the toaster for the waffle iron if and only 
if the utility difference between the waffle iron and the toaster is at least as great 
as the difference between the blender and the waxer (due account being taken of 
the algebraic sign of the difference). A sequence of such exchanges (repetitions 
permitted) can easily be devised such that every utility difference is compared to 
every other. Our axioms specify for the set of choices sufficient ideal properties 
to guarantee the existence of a cardinal utility function. 4 

From another conceptual standpoint (as pointed out to us by our colleague, 
Professor Davidson), we may think of the housewife as expressing a simple 
preference between pairs of appliances. Thus if she trades the toaster for the 
waffle iron she has decided that she would rather have the pair (waffle iron, waxer) 
than the pair (toaster, blender). Put in these terms we are asking for a utility 
function $\varphi$ of the Frisch and Fisher type ([10], [11]) such that one pair (x, y) is 
preferred to another (u, v) if and only if 

$$\varphi(x) + \varphi(y) > \varphi(u) + \varphi(v).$$

The existence of such a function is taken to mean that “utilities are independent,” 
that is, the commodities involved are neither complementary nor competitive 
with respect to each other. Viewed in this light, our axioms analyze the special 
conditions required for the existence of a cardinal utility function on a set of 
independent commodities. Whatever one’s a priori feelings about the plausibility 
of the independence hypothesis there can be little doubt that the experiment 
just described would provide a means of empirically testing the hypothesis, 5 and 
thus would satisfy Samuelson’s methodological demand ([18], p. 183): 

It may be argued that regarded purely as a working hypothesis the facts do not sharply 
contradict the independence assumption. A little investigation reveals that such a hy¬ 
pothesis has not been tested from this point of view. On the contrary, it is implicitly as¬ 
sumed from the beginning in the manipulation of the statistical data. Hence, one would 
have to go back to examine the original empirical data. 

It is interesting to note that the problem of complementarity occupies a position 
in this interpretation analogous to the position occupied by the problem of a
specific utility of gambling in a probabilistic interpretation. 

It is also our opinion that many areas of economic and modern statistical theory 
do not warrant a behavioristic analysis of utility. In these domains, there seems 
little reason to be ashamed of direct appeals to introspection. For example, in 
welfare economics there are sound arguments for adopting a subjective view 
which would justify the determination of utility differences by introspective 
methods. Some psychological experiments on utility differences which essentially 
use introspective methods are reported in [4]. 

It is to be emphasized that the formal results presented in the remainder of 
this paper do not depend on any of the particular interpretations here proposed. 

2. Primitive and Defined Notions 

Our axiomatization is based on three primitive notions. The primitive K is a
non-empty set, to be interpreted as a set of alternatives (objects, experiences,
events, or decisions) available to a given individual at a given time. The primitive
Q is a binary relation whose field is K; the interpretation of Q is that x Q y if
and only if the individual does not prefer y to x. The third primitive is a quaternary
relation R whose field is also K. In the intended interpretation x, y R z, w
if and only if the difference in preference between x and y is not greater than the
difference in preference between z and w.

4 By considering just six items, we cannot get a realization of the axioms given in Section
3. However, by increasing the number of items, we would presumably be able to get a
successively closer approximation.

5 Some experiments are planned in collaboration with Professor Davidson.

Our axiomatization assumes a rather complicated form if it is given only in 
terms of our three primitives. It is intuitively desirable to use some defined no¬ 
tions whose interpretation follows directly from that of the primitives. 

Definition D1. x I y if and only if x Q y and y Q x. Obviously, I is the relation
of indifference.

Definition D2. x P y if and only if not y Q x. The relation P is the relation of
strict preference.

Definition D3. x, y E z, w if and only if x, y R z, w and z, w R x, y. The interpretation
of the quaternary relation E is that if x, y, z and w are alternatives,
then x, y E z, w if and only if the difference in preference between x and y is
equivalent to the difference in preference between z and w.

Definition D4. x, y S z, w if and only if not z, w R x, y. Clearly, x, y S z, w if
and only if the difference in preference between x and y is strictly less than the
difference between z and w.

Definition D5. B(y, x, z) if and only if either x P y and y P z, or z P y and
y P x. The intuitive idea of betweenness is expressed by the relation B.

The above notions suffice for the statement of all but the last axiom, the Archimedean
axiom. For the latter, one further quaternary relation is needed.

Definition D6. x, y M z, w if and only if y I z and B(y, x, w) and x, y E z, w.
The quaternary relation M appears to be a trivial specialization of the relation
E. To clarify this situation, we introduce the notion of powers of M. The second
power of M, for example, is the relation $M^2$ such that $x, y\ M^2\ z, w$ if and only if
there exist elements u and v such that x, y M u, v and u, v M z, w. The nth power
of M is defined recursively:

$x, y\ M^1\ z, w$ if and only if x, y M z, w;

$x, y\ M^n\ z, w$ if and only if there exist elements u and v such that
$x, y\ M^{n-1}\ u, v$ and u, v M z, w.

The difference between powers of E and of M may be brought out by interpreting
x, y, z, and w as points on a line. The interpretation of $x, y\ M^3\ z, w$, for instance,
is that the intervals (x, y) and (z, w) are of the same length, and there
are two intervals of this length between y and z. Of special significance is the
fact that the interval (x, w) is four times the length of (x, y). On the other hand,
in the case of the relation $E^3$ no specific length relation may be inferred for intervals
(x, w) and (x, y).
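
The defined notions and the powers of M can be tried out on a small numerical model. The sketch below is an illustration under stated assumptions, not the authors' construction: alternatives are taken to be integers on a line, Q is read as "less than or equal" and R as a comparison of absolute differences, and the existential quantifiers in the definition of the nth power range over a finite set K. All function names are hypothetical.

```python
from itertools import product

# Assumed numerical model: Q is "<=" on numbers, R compares absolute differences.
def Q(x, y):
    return x <= y

def R(x, y, z, w):
    return abs(x - y) <= abs(z - w)

def I(x, y):                      # D1: indifference
    return Q(x, y) and Q(y, x)

def P(x, y):                      # D2: strict preference
    return not Q(y, x)

def E(x, y, z, w):                # D3: equivalent differences
    return R(x, y, z, w) and R(z, w, x, y)

def B(y, x, z):                   # D5: y lies between x and z
    return (P(x, y) and P(y, z)) or (P(z, y) and P(y, x))

def M(x, y, z, w):                # D6: adjacent intervals of equal length
    return I(y, z) and B(y, x, w) and E(x, y, z, w)

def M_power(n, x, y, z, w, K):
    """x, y M^n z, w, with the intermediate elements u, v drawn from the finite set K."""
    if n == 1:
        return M(x, y, z, w)
    return any(
        M_power(n - 1, x, y, u, v, K) and M(u, v, z, w)
        for u, v in product(K, repeat=2)
    )

K = range(5)                      # the points 0, 1, 2, 3, 4 on a line
print(M_power(3, 0, 1, 3, 4, K))  # intervals (0, 1) and (3, 4), two unit intervals between
```

In this model the third power relates (0, 1) to (3, 4), and the spanning interval (0, 4) is four times as long as (0, 1), in line with the geometric reading given above.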

As we shall see in Section 5, the proof of our representation theorem essentially 
depends on exploiting the properties of the powers of M. 
3. Axioms

Using our primitive and defined notions, we now state our axioms for difference 
structures. 

A system $\mathcal{K} = (K, Q, R)$ will be said to be a DIFFERENCE STRUCTURE if
the following eleven axioms are satisfied for every x, y, z, w, u, and v in K:

Axiom A1. x Q y or y Q x;

Axiom A2. If x Q y and y Q z then x Q z;

Axiom A3. x, y R z, w or z, w R x, y;

Axiom A4. If x, y R z, w and z, w R u, v then x, y R u, v;

Axiom A5. x, y R y, x;

Axiom A6. There is a t in K such that x, t E t, y;

Axiom A7. If x I y and x, z R u, v, then y, z R u, v;

Axiom A8. If B(y, x, z) then x, y S x, z;

Axiom A9. If B(y, x, z) and B(w, u, v) and x, y R u, w and y, z R w, v, then
x, z R u, v;

Axiom A10. If x, y S u, v then there is a t in K such that B(t, u, v) and x, y R u, t;

Axiom A11. If x, y R u, v and not x I y, then there are elements s and t in K
and a positive integer n such that $u, s\ M^n\ t, v$ and u, s R x, y.

The interpretation of Axioms A1-A4 is obvious. Axiom A5 expresses a com¬ 
mutativity property of R and means essentially that for pairs of elements to 
stand in the relation R only their differences matter and not their relative order. 

Axiom A6 means intuitively that between any two elements of K, there is a 
midpoint. This axiom represents a more reasonable assumption than, for instance, 
a formulation requiring that between any two elements there exist an element 
some arbitrary part, say of the distance between them. Indeed, the axiom 
as here stated, receives empirical corroboration in the field of psychology from 
the practice of “fractionation” and “bisection” experiments requiring the subject 
to select the tones in just the way described, and from the existence of laboratory 
equipment designed for such experimental use. (See, e.g., [19] and [20].) Also, 
the probabilistic experiments ([6]) described in the first section have demon¬ 
strated the practicality of finding such midpoints. 

Axiom A10 means that if the difference between x and y is less than that 
between u and v, then there is an element t of K between u and v and the differ¬ 
ence between x and y is not greater than the difference between u and t. 

Axiom A11, the Archimedean axiom, means that if the difference between x and
y is not greater than that between u and v, and if x is not indifferent to y, then
there are n elements of K equally spaced in utility between u and v such that
the difference between any consecutive two of these elements is not greater than
the difference between x and y.
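
Several of the axioms can be spot-checked on a numerical model. The sketch below is only an illustration under assumptions: alternatives are rational numbers, Q is read as "less than or equal", R as a comparison of absolute differences, and the universally quantified Axioms A1-A5 and A7-A9 are tested on a small finite sample (which, of course, proves nothing about K in general). Axiom A6 is illustrated by the arithmetic midpoint. All names are hypothetical.

```python
from fractions import Fraction
from itertools import product

# Assumed numerical model of the primitives and defined relations.
def Q(x, y): return x <= y
def R(x, y, z, w): return abs(x - y) <= abs(z - w)
def I(x, y): return Q(x, y) and Q(y, x)
def P(x, y): return not Q(y, x)
def E(x, y, z, w): return R(x, y, z, w) and R(z, w, x, y)
def S(x, y, z, w): return not R(z, w, x, y)
def B(y, x, z): return (P(x, y) and P(y, z)) or (P(z, y) and P(y, x))

def check_universal_axioms(K):
    """Assert A1-A5 and A7-A9 for every choice of x, y, z, w, u, v from the sample K."""
    for x, y, z, w, u, v in product(K, repeat=6):
        assert Q(x, y) or Q(y, x)                                        # A1
        assert not (Q(x, y) and Q(y, z)) or Q(x, z)                      # A2
        assert R(x, y, z, w) or R(z, w, x, y)                            # A3
        assert not (R(x, y, z, w) and R(z, w, u, v)) or R(x, y, u, v)    # A4
        assert R(x, y, y, x)                                             # A5
        assert not (I(x, y) and R(x, z, u, v)) or R(y, z, u, v)          # A7
        assert not B(y, x, z) or S(x, y, x, z)                           # A8
        assert not (B(y, x, z) and B(w, u, v)
                    and R(x, y, u, w) and R(y, z, w, v)) or R(x, z, u, v)  # A9
    return True

def midpoint(x, y):
    """A candidate for the element t of Axiom A6 in the rational model."""
    return (x + y) / 2

sample = [Fraction(0), Fraction(1, 2), Fraction(1), Fraction(2)]
print(check_universal_axioms(sample))
```

Exact rational arithmetic is used so that the equalities behind E are not disturbed by rounding. The existential Axioms A10 and A11 are not covered by this sketch.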

4. Elementary Theorems 

A rather large number of elementary theorems is required, for the complete 
proof of our representation theorem for difference structures. In the present 
paper, however, we are concerned merely to sketch the main outlines of such a 
proof; and, for this purpose, it will be sufficient in this section to present definitions
of certain relations, not needed for stating the axioms, but used in a key
way to develop the required proof; and to state without proof several elementary
theorems which describe typical properties of the relations defined, or which
figure centrally in the sketched proof of the representation theorem. In particular,
we omit completely a large group of theorems which develops the expected
properties of Q and R and of the other simple “qualitative” relations (I, P, E, S,
B) described in Section 2.

We first introduce the notion of the quaternary relation N(a).

Definition D7. N(a) is the quaternary relation defined as follows:

i) if a = 1, then x, y N(a) u, v if and only if x I u and y I v;

ii) if $a \neq 1$, then x, y N(a) u, v if and only if x I u and there exists a z such that
$x, y\ M^{a-1}\ z, v$.

The interpretation of N(1), of course, is obvious. To say for $a \neq 1$ that x, y
N(a) u, v means that x and u coincide, and that there are a - 1 equally spaced
elements of K between u and v such that the difference between any two of them 
equals the difference between x and y. If x, y, u and v are interpreted as points 
on a line, this notion obviously corresponds to the intuitive notion of “laying off” 
an interval on another interval; that is, we interpret x, y N(a) u, v intuitively as 
meaning that if we start from u , and “lay off” an interval of the length (x, y) a 
times in the appropriate direction, we obtain the interval (u, v) . By means of the 
N(a) relation, therefore, we are able to express the quantitative fact that the 
length of an interval (u, v) is a times the length of a subinterval (x, y). 

The sort of “multiplication” of intervals characterized by the N(a) relation 
possesses the expected properties; for example, we have the following theorem 
concerning ratios of intervals. 

Theorem 1. If x, z N(a) x, y and x, z N(ab) x, w then x, y N(b ) x, w. 
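
The "laying off" reading of N(a) and the ratio property of Theorem 1 can be illustrated on the rational line. This is a sketch under the assumption that alternatives are rational numbers and that N(a) is read geometrically as described above; the helper names are hypothetical.

```python
from fractions import Fraction

def N(a, x, y, u, v):
    """Assumed numerical reading of x, y N(a) u, v on the rational line."""
    if a == 1:
        return x == u and y == v
    # For a > 1 the chain of M-steps needs a non-degenerate interval (x, y).
    return x == u and x != y and (v - u) == a * (y - x)

def lay_off(a, x, y):
    """Endpoint reached by laying off the interval (x, y) a times starting from x."""
    return x + a * (y - x)

x, z = Fraction(0), Fraction(1, 2)
a, b = 2, 3
y = lay_off(a, x, z)        # (x, y) is a times (x, z)
w = lay_off(a * b, x, z)    # (x, w) is ab times (x, z)
print(N(a, x, z, x, y), N(a * b, x, z, x, w), N(b, x, y, x, w))
```

The third value printed corresponds to the conclusion of Theorem 1 for this particular choice of points.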

Another theorem involving the N(a) relation generalizes A6 and may be justi¬ 
fied along similar lines. Characteristic of our system, it asserts that appropriate 
elements exist for dividing any interval into powers of 2. 

Theorem 2. If not x I y then there is a z such that $x, z\ N(2^m)\ x, y$.

Further N(a)-theorems state properties of “N-multiplication” for powers of 2.
We have, for example, the usual law for addition of exponents:

Theorem 3. If $x, w\ N(2^m)\ x, z$ and $x, z\ N(2^n)\ x, y$ then $x, w\ N(2^{m+n})\ x, y$.

A crucial, but less obvious property is stated in the following theorem.

Theorem 4. If B(y, x, z) and $x, t\ N(2^m)\ x, y$ and $y, s\ N(2^m)\ y, z$ and $x, r\ N(2^m)\ x, z$,
then t, r E y, s.

We now define a relation in terms of which most of the proof of the representa¬ 
tion theorem is carried through. 

Definition D9. H(m, a; n, b) is the quaternary relation such that x, y H(m, a; n, b)
u, v if and only if there are elements $z_1$, $z_2$, $w_1$ and $w_2$ such that $x, z_1\ N(2^m)\ x, y$ and
$u, w_1\ N(2^n)\ u, v$ and $x, z_1\ N(a)\ x, z_2$ and $u, w_1\ N(b)\ u, w_2$ and $x, z_2\ R\ u, w_2$.

To say that x, y H(m, a; n, b) u, v means intuitively that an $(a/2^m)$th part of
the interval (x, y) is not greater than a $(b/2^n)$th part of the interval (u, v).

We may view our first theorem on this notion as enabling us to specify a partial 
bound for the values of arguments satisfying the H-relation between two intervals.

Theorem 5. If not x I y and x, y H(m, a; n, b) u, v, then not u, v H(n, b; m + 1,
a) x, y.

Since the H-relation can be thought of intuitively as a special sort of inequality
we would expect to be able to prove many of the laws governing inequalities.
Thus Theorem 6 expresses a kind of transitivity property and Theorem 7 an intuitively
simple conservation property. Theorems 8, 9, 10 and 11 assert cancellation
and multiplication laws.

Theorem 6. If x, y H(m, a; n, b) u, v and u, v H(n, b; p, c) r, s then x, y H(m, a;
p, c) r, s.

Theorem 7. If not x, y H(m, a; n, b) u, v and w, z H(p, c; n, b) u, v and $a \le 2^m$
and not x I y, then not x, y H(m, a; p, c) w, z.

Theorem 8. If x, y H(m, a; n, b) u, v and $ac \le 2^m$ and $bc \le 2^n$, then x, y H(m,
ac; n, bc) u, v.

Theorem 9. If x, y H(m, ac; n, bc) u, v, then x, y H(m, a; n, b) u, v.

Theorem 10. If x, y H(m, a; n, b) u, v and either $m \neq 0$ or not x I y, then x, y
H(m + c, a; n + c, b) u, v.

Theorem 11. If x, y H(m + c, a; n + c, b) u, v and $a \le 2^m$ and $b \le 2^n$ then
x, y H(m, a; n, b) u, v.

Theorem 12 states an addition property for the arguments of the H-relation in
the case of adjacent intervals.

Theorem 12. If B(y, x, z) and $a + b \le 2^n$ and x, y H(m, 1; n, a) u, v and y,
z H(m, 1; n, b) u, v then x, z H(m, 1; n, a + b) u, v. 6

Finally, we state two existence theorems for arguments of the H-relation.
These theorems are the form in which we make use of our purely qualitative continuity
axiom (A10) and our Archimedean axiom (A11) respectively.

Theorem 13. If x, y S u, v, then there are integers b and n such that $b < 2^n$ and
x, y H(0, 1; n, b) u, v.

Theorem 14. If not u I v, then there is an integer m such that x, y H(m, 1; 0, 1) u, v.
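
The inequality reading of the H-relation lends itself to a quick numerical illustration. The sketch below assumes the intuitive interpretation stated after Definition D9 (a fractional part of one interval not exceeding a fractional part of another) on the rational line; it is not the qualitative definition itself, and the function name is hypothetical. It then evaluates one instance each of the properties described in Theorems 5, 6 and 12.

```python
from fractions import Fraction

def H(m, a, n, b, x, y, u, v):
    """Assumed numerical reading of x, y H(m, a; n, b) u, v:
    (a / 2^m) of |x - y| is not greater than (b / 2^n) of |u - v|."""
    return Fraction(a, 2**m) * abs(x - y) <= Fraction(b, 2**n) * abs(u - v)

# Theorem 5 (partial bound): the intervals (0, 3) and (0, 8) with m = 0, a = 1, n = 3, b = 3.
holds = H(0, 1, 3, 3, 0, 3, 0, 8)
reverse_fails = not H(3, 3, 1, 1, 0, 8, 0, 3)

# Theorem 6 (transitivity): chain (0, 3) -> (0, 8) -> (0, 16).
chain = H(0, 1, 3, 3, 0, 3, 0, 8) and H(3, 3, 2, 1, 0, 8, 0, 16)
transitive = H(0, 1, 2, 1, 0, 3, 0, 16)

# Theorem 12 (addition for adjacent intervals): 1 lies between 0 and 3.
left = H(0, 1, 2, 1, 0, 1, 0, 4)
right = H(0, 1, 2, 2, 1, 3, 0, 4)
whole = H(0, 1, 2, 3, 0, 3, 0, 4)

print(holds, reverse_fails, chain, transitive, left, right, whole)
```

Each check concerns a single instance only; the theorems themselves are statements about arbitrary difference structures.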


5. Representation Theorem 

Our desired representation theorem is an immediate consequence of the follow¬ 
ing lemma. (As a matter of fact, it is rather customary in the theory of measure¬ 
ment to label a lemma of this sort the “theorem of adequacy” and not to state 
explicitly a representation theorem. Cf., e.g., [16], pp. 24-29.)

Fundamental Lemma. Let $\mathcal{K} = (K, Q, R)$ be a difference structure. Then:

A) There exists a real-valued function $\varphi$ defined on K such that for every x, y, z, and w
in K,

i) x Q y if and only if $\varphi(x) \le \varphi(y)$, and

ii) x, y R z, w if and only if $|\varphi(x) - \varphi(y)| \le |\varphi(z) - \varphi(w)|$.

B) If $\varphi_1$ and $\varphi_2$ are any two functions satisfying (A), then there exist real numbers
$\alpha$ and $\beta$ with $\alpha > 0$ such that for every x in K, $\varphi_1(x) = \alpha\varphi_2(x) + \beta$.

6 We are indebted to Professor Herman Rubin for the proof of this theorem.
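
Part (B) of the lemma says that a representing function is determined up to a positive linear transformation. The easy direction can be illustrated numerically: the sketch below (an illustration with invented functions and a small invented set K, not part of the proof) checks that a positive linear transform induces the same relations through conditions (i) and (ii), while an order-preserving but non-linear transform preserves (i) and not (ii).

```python
from itertools import product

def same_structure(phi1, phi2, K):
    """True if phi1 and phi2 induce the same Q and R on K via conditions (i) and (ii)."""
    for x, y, z, w in product(K, repeat=4):
        if (phi1(x) <= phi1(y)) != (phi2(x) <= phi2(y)):
            return False
        r1 = abs(phi1(x) - phi1(y)) <= abs(phi1(z) - phi1(w))
        r2 = abs(phi2(x) - phi2(y)) <= abs(phi2(z) - phi2(w))
        if r1 != r2:
            return False
    return True

K = range(-2, 3)

def identity(x):
    return x

def affine(x):
    return 3 * x + 2          # alpha = 3 > 0, beta = 2 (illustrative values)

def cubic(x):
    return x ** 3             # order-preserving but not linear

print(same_structure(identity, affine, K), same_structure(identity, cubic, K))
```

The cubic transform fails because it distorts the comparison of differences: the intervals (1, 2) and (0, 1) have equal length under the identity but not after cubing.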
Proof: Part A. We begin by choosing two elements u and v in K such that
u P v (if no such two elements exist, the proof is trivial). We next define for
x and y in K the set of numbers $\mathcal{S}(x, y; u, v)$. A rational number r is in $\mathcal{S}(x, y; u, v)$
if and only if there are non-negative integers m and n and a positive integer b
such that $b \le 2^n$ and $r = (b 2^m)/2^n$ and x, y H(m, 1; n, b) u, v.

Let r and r' be positive rational numbers. Using Theorems 8, 10, 6, 9 and 11,
in that order, we may easily prove that

(1) If $r \in \mathcal{S}(x, y; u, v)$ and $r < r'$ then $r' \in \mathcal{S}(x, y; u, v)$.

Using now principally Theorem 14 and Theorem 5 we may show that if not
x I y then the set $\mathcal{S}(x, y; u, v)$ has a positive number as a lower bound. Since by
Theorem 14 $\mathcal{S}(x, y; u, v)$ is not empty, we conclude that it has a greatest lower
bound. We use this fact to define the function $f_{(u,v)}$:

$f_{(u,v)}(x, y)$ is the greatest lower bound of $\mathcal{S}(x, y; u, v)$.

Obvious arguments prove that

$f_{(u,v)}(x, y) = 0$ if and only if x I y

and

$f_{(u,v)}(u, v) = 1$,

the choice of (u, v) thus corresponding to choice of a unit of length.
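
The construction of the greatest lower bound can be imitated numerically. The sketch below is an approximation under assumptions, not the construction itself: points are rational numbers, the H-relation is given its intuitive inequality reading, and the exponents m and n are truncated at a hypothetical bound max_exp, so that the minimum of the truncated set only approximates the greatest lower bound from above.

```python
from fractions import Fraction

def H(m, a, n, b, x, y, u, v):
    """Assumed numerical reading of x, y H(m, a; n, b) u, v."""
    return Fraction(a, 2**m) * abs(x - y) <= Fraction(b, 2**n) * abs(u - v)

def S_set(x, y, u, v, max_exp):
    """Members r = b * 2^m / 2^n of the set S(x, y; u, v), with m, n <= max_exp."""
    members = set()
    for m in range(max_exp + 1):
        for n in range(max_exp + 1):
            for b in range(1, 2**n + 1):
                if H(m, 1, n, b, x, y, u, v):
                    members.add(Fraction(b * 2**m, 2**n))
    return members

def f_approx(x, y, u, v, max_exp=10):
    """Minimum of the truncated set; an upper approximation to the greatest lower bound."""
    return min(S_set(x, y, u, v, max_exp))

print(f_approx(0, 3, 0, 8))   # length ratio 3/8 is dyadic, so it is attained exactly
print(f_approx(0, 1, 0, 3))   # length ratio 1/3 is approached from above
print(f_approx(0, 8, 0, 8))   # the unit interval (u, v) itself
```

In this model the quantity being approximated is the ratio of the length of (x, y) to the length of (u, v), which is why the choice of (u, v) acts as a choice of unit.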

We obtain by an indirect argument from (1) that for any rational number r 

(2) If $f_{(u,v)}(x, y) < r$ then $r \in \mathcal{S}(x, y; u, v)$,

and we are in a position to establish:

(3) If x, y R z, w then $f_{(u,v)}(x, y) \le f_{(u,v)}(z, w)$.

(The proof is trivial in case x I y; hence we assume: not x I y.) Suppose, if possible,
that $f_{(u,v)}(z, w) < f_{(u,v)}(x, y)$. Then there are integers m, n, b such that

$f_{(u,v)}(z, w) < \frac{b 2^m}{2^n} < f_{(u,v)}(x, y)$.

From (2) we then obtain:

z, w H(m, 1; n, b) u, v and not x, y H(m, 1; n, b) u, v.

Hence by Theorem 7, not x, y H(m, 1; m, 1) z, w, and thus by Theorem 10 and
D9, not x, y R z, w, which contradicts the hypothesis of (3).

We next prove:

(4) If $f_{(u,v)}(x, y) \le f_{(u,v)}(z, w)$ then x, y R z, w.

Let

$r = \frac{b_1 2^{m_1}}{2^{n_1}}$ be in $\mathcal{S}(x, y; u, v)$

and let

$q = \frac{b_2 2^{m_2}}{2^{n_2}}$ be in $\mathcal{S}(z, w; x, y)$.

Then we have: $b_1 \le 2^{n_1}$, $b_2 \le 2^{n_2}$, $x, y\ H(m_1, 1; n_1, b_1)\ u, v$ and $z, w\ H(m_2, 1;
n_2, b_2)\ x, y$. Hence by Theorem 10, Theorem 8, and Theorem 6, $z, w\ H(m_1 + m_2,
1; n_1 + n_2, b_1 b_2)\ u, v$. We conclude that

(5) rq is in $\mathcal{S}(z, w; u, v)$.

Now for the moment let

$\alpha = f_{(u,v)}(x, y)$

$\beta = f_{(x,y)}(z, w)$

$\gamma = f_{(u,v)}(z, w)$.

Suppose, if possible, that $\alpha\beta < \gamma$. Then there is a positive $\epsilon$ such that
$(\alpha + \epsilon)(\beta + \epsilon) = \gamma$. Clearly we may choose a number r in the open interval $(\alpha, \alpha + \epsilon)$
and a number q in the open interval $(\beta, \beta + \epsilon)$ such that r is in $\mathcal{S}(x, y; u, v)$ and
q is in $\mathcal{S}(z, w; x, y)$. Since $rq < \gamma$, rq is not in $\mathcal{S}(z, w; u, v)$, but this contradicts
(5), and we conclude that

(6) $f_{(u,v)}(x, y) \cdot f_{(x,y)}(z, w) \ge f_{(u,v)}(z, w)$.

Suppose now that not x, y R z, w. By Theorem 13 it follows that there is an n
and a b with $b/2^n < 1$ such that z, w H(0, 1; n, b) x, y, and we conclude that
$f_{(x,y)}(z, w) < 1$. Combined with (6), this result gives us: $f_{(u,v)}(x, y) > f_{(u,v)}(z, w)$,
which contradicts our hypothesis, completing the proof of (4).

We now define the function $\varphi_{(u,v)}$ as follows. For every x in K,

$\varphi_{(u,v)}(x) = f_{(u,v)}(u, x)$, if u Q x,

$\varphi_{(u,v)}(x) = -f_{(u,v)}(u, x)$, if x Q u.

We see at once that $\varphi_{(u,v)}(u) = 0$, and thus our choice of u corresponds to the
choice of an origin. (3) and (4) provide the basis for an obvious proof that

(7) x Q y if and only if $\varphi_{(u,v)}(x) \le \varphi_{(u,v)}(y)$.

To complete the proof of Part A we need to show that

(8) x, y R z, w if and only if $|\varphi_{(u,v)}(x) - \varphi_{(u,v)}(y)| \le |\varphi_{(u,v)}(z) - \varphi_{(u,v)}(w)|$.

From (3) and (4) we see at once that it will be sufficient to prove

(9) $f_{(u,v)}(x, y) = |\varphi_{(u,v)}(x) - \varphi_{(u,v)}(y)|$.

Of the five possible cases that need to be considered for (9) we consider only the
typical one where x P y and y P u. For this case we must prove:

(10) $f_{(u,v)}(x, y) + f_{(u,v)}(u, y) = f_{(u,v)}(u, x)$.

Suppose, if possible, that

$f_{(u,v)}(x, y) + f_{(u,v)}(u, y) < f_{(u,v)}(u, x)$.

Then clearly there are integers m, n, b, $b_1$, $b_2$ such that

(11) $f_{(u,v)}(x, y) + f_{(u,v)}(u, y) < \frac{b 2^m}{2^n} < f_{(u,v)}(u, x)$,

$f_{(u,v)}(x, y) < \frac{b_1 2^m}{2^n}$,

$f_{(u,v)}(u, y) < \frac{b_2 2^m}{2^n}$,

and

$b = b_1 + b_2 \le 2^n$.

By (2) we have: $x, y\ H(m, 1; n, b_1)\ u, v$ and $y, u\ H(m, 1; n, b_2)\ u, v$. Hence by
Theorem 12, $x, u\ H(m, 1; n, b_1 + b_2)\ u, v$, but from (11), we infer: not $x, u\
H(m, 1; n, b_1 + b_2)\ u, v$. On the basis of this contradiction, we conclude that

(12) $f_{(u,v)}(x, y) + f_{(u,v)}(u, y) \ge f_{(u,v)}(u, x)$,

and by an argument similar to the above we may show that equality holds in
(12), thus establishing (9) for a typical case, and completing the proof of Part A.

Part B. Using elements u and v in K as in the proof of (A), we define functions
$h_1$ and $h_2$ for every x in K by the equations:

$h_1(x) = \frac{\varphi_1(x) - \varphi_1(u)}{\varphi_1(v) - \varphi_1(u)}$,

$h_2(x) = \frac{\varphi_2(x) - \varphi_2(u)}{\varphi_2(v) - \varphi_2(u)}$,

where $\varphi_1$ and $\varphi_2$ are functions satisfying (A). Since u P v, we see at once that

$h_1(u) = h_2(u) = 0$,

$h_1(v) = h_2(v) = 1$,

and that $h_1$ and $h_2$ satisfy (A). Thus in order to establish (B) it will be sufficient
to prove that

(1) $h_1 = h_2$.

We give the proof for the case where u P x and x P v. Suppose, if possible,
that $h_1(x) \neq h_2(x)$. For definiteness, let $h_1(x) < h_2(x)$. Then there is a positive
$\epsilon$ such that

(2) $h_2(x) = h_1(x) + \epsilon$.

We now consider the smallest integer, say, n*, such that $1/2^{n^*} < \epsilon$. (Since $h_1(x)$
and $h_2(x)$ are both between 0 and 1, n* > 0.) By Theorem 2 there exists an
element, say, z*, such that $u, z^*\ N(2^{n^*})\ u, v$. A simple argument shows that we
must have: z* P x.

Suppose now that there is an integer a such that u, z* N(a) u, x. It is easy to
prove by induction that we must then be able to infer:

(3) $h_1(x) = h_2(x) = a/2^{n^*}$,

which contradicts (2).

Since on the supposition of (2) there is no such integer a, there must be an integer
b and elements $z_1$ and $z_2$ such that

(4) $u, z^*\ N(b)\ u, z_1$,
$u, z^*\ N(b + 1)\ u, z_2$,
$z_1\ P\ x$,
$x\ P\ z_2$.

Using the induction which yielded (3), we have from (4),

$h_2(z_2) - h_1(z_1) = \frac{b + 1}{2^{n^*}} - \frac{b}{2^{n^*}} < \epsilon$,

and we also obtain from (4):

$h_1(z_1) < h_1(x) < h_1(z_2)$,
$h_2(z_1) < h_2(x) < h_2(z_2)$.

Combining inequalities we conclude:

$h_2(x) - h_1(x) < h_2(z_2) - h_1(z_1) < \epsilon$,

which contradicts (2).

The proof of (1) is completed by a consideration of the four other possible cases
for the position of x with respect to u and v. (Two of the cases are trivial: u I x
and v I x.) Since (1) establishes (B), the proof of our lemma is finished.

We would not expect to have a strict isomorphism between an arbitrary difference
structure $\mathcal{K} = (K, Q, R)$ and some numerical structure, since distinct
elements which stand in the relation I are assigned the same number. However,
by considering the coset algebra $\mathcal{K}/I = (K/I, Q/I, R/I)$ of $\mathcal{K}$ under I, we may
easily establish such an isomorphism. (Since I is obviously a congruence relation
on K with respect to Q and R, it should be clear that K/I is the set of all I-equivalence
classes and that Q/I and R/I are the relations between equivalence classes
corresponding to Q and R.)

We define the quaternary relation T for real numbers as follows:

if $\alpha$, $\beta$, $\gamma$, and $\delta$ are real numbers, then $\alpha, \beta\ T\ \gamma, \delta$
if and only if $|\alpha - \beta| \le |\gamma - \delta|$.

Let N be a set of real numbers. Then we call an ordered triple $(N, \le, T)$ a
numerical difference structure if N is closed under the formation of mid-points,
i.e., if $\alpha$, $\beta$ are in N, then $(\alpha + \beta)/2$ is in N. We then obtain the following representation
theorem as an immediate consequence of our lemma.

Representation Theorem. If $\mathcal{K} = (K, Q, R)$ is a difference structure, then $\mathcal{K}/I =
(K/I, Q/I, R/I)$ is isomorphic to a numerical difference structure. Moreover any
two numerical difference structures isomorphic to $\mathcal{K}/I$ are related by a linear
transformation.
AGGREGATION OF UTILITY FUNCTIONS* 1

E. EISENBERG

Hughes Aircraft Company, Malibu, California

We are primarily concerned here with the question of integrability of the 
total demand in a model in which each consumer acts according to a cardinal 
utility function and has a fixed monetary income. It is well known that con¬ 
cavity of the various utilities is not sufficient to guarantee integrability, nor 
even to ensure rationality of the revealed preference. We show that if each 
personal utility function is homogeneous, in addition to satisfying the usual 
regularity conditions, then an aggregate utility function can be defined ex¬
plicitly in terms of the given utilities. Furthermore, under the same assump¬ 
tions we give a new characterization of equilibrium and show that equilibrium 
satisfactions are unique. 


1. Introduction 

The economic model to be considered is the following: In a market there are m 
consumers, each competing for one or more of a set of n commodities. Every 
consumer has a fixed positive monetary income 2 while production is outside the 
model, i.e., we do not consider negative quantities of the goods in question. A 
consumer’s behavior is completely described by a personal utility function; 
when prices for the various goods are given, he will act so as to maximize his 
utility subject to his budget restriction. One of the first questions that arise is 

i) For each set of prices we have the total demand. Can this demand be 
thought of as expressing the behavior of a single (fictitious) consumer acting 
according to a well-defined (aggregate) utility function? 

The existence of an aggregate utility function would tell us, among other things, 
that the community revealed preference is rational, i.e., if a bundle x is preferred to 
a bundle y and simultaneously y is preferred to x, then y is demanded whenever x 
is demanded and y can be purchased. In section IV we demonstrate a model for 
which the revealed preference is not rational, so that an aggregate utility func¬ 
tion need not always exist. 

* Received May 1960.

1 This paper is based partly on results obtained while the author was a consultant with
The RAND Corporation.

2 In the classical exchange model each consumer holds a fixed commodity bundle for
trading purposes. It can be shown that a fixed monetary income model can be thought of
as a special case of the exchange model. The specialization is: If a consumer holds two
commodities in quantities $\xi$ and $\eta$, respectively, while another consumer holds $\xi'$ and $\eta'$ units
of the same commodities then $\xi\eta' = \xi'\eta$.

3 If a demand is given as a function of prices then the bundle x is (weakly) preferred to
the bundle y providing there exist bundles $x_1, \ldots, x_k$ ($k \ge 2$) and price vectors $p_1, \ldots,
p_{k-1}$ such that $x_i$ is demanded at prices $p_i$, $x_{i+1}$ can be purchased at prices $p_i$ ($i = 1, \ldots,
k - 1$) and $x_1 = x$, $x_k = y$.

Under certain circumstances the answer to i) is, as evidenced by relation (20)
below, closely related to the existence of equilibrium prices 4 and it is relevant
to ask:

ii) Does equilibrium exist for every supply? 

It is a well known fact that any equilibrium distribution of the available supply 
is optimal (or “efficient”) in the sense that by redistributing the available supply 
one cannot increase the utility of one consumer without decreasing the utility 
of some other consumer. Equilibrium prices need not, however, be unique and a 
given consumer’s satisfaction (the value of his utility) may vary from one set 
of equilibrium prices to another. Consumers would then find it advantageous 
to enter into coalitions that would improve their position at the expense of other 
consumers. This difficulty in analyzing the market’s stability can be settled by 
the following question: 

iii) Are consumers’ satisfactions unique at equilibrium? 

For practical reasons we ask: 

iv) How can equilibrium quantities be computed? 

We propose to examine the preceding questions when all personal utilities are 
concave, homogeneous (of order 1), non-negative, continuous and non-constant. 
Theorems 1, 2 and 3 provide affirmative answers to iii), i) and ii), respectively, 6
although for ii) we require an essential hypothesis that the supply be positive. 
At the same time, iv) is answered (again for positive supply) by charac¬ 
terizing equilibrium quantities as solutions to a convex-homogeneous program¬ 
ming problem (Theorem 4). We have not, as would seem desirable, given neces¬ 
sary and sufficient conditions for the existence of an aggregate utility (nor 
for the existence of equilibrium). It can be said, however, that with very slight 
changes of the assumptions given above, models in which the revealed preference 
is not rational can be constructed and thus an aggregate utility does not exist 
(see Example 1 in Section IV).

Before proceeding further, we wish to remark about the assumptions im¬ 
posed on each personal utility. Certainly all but concavity and homogeneity are 
self-explanatory. The assumption of homogeneity is probably the less realistic 
of the two; however, we may think of it as a second approximation. As a first 
approximation, it is frequently assumed that the utility functions are linear, 
simply because the theory of linear inequalities is so well developed and be¬ 
cause an arbitrary concave function may be approximated by linear ones. 
Homogeneity, then, is a far more flexible assumption and no doubt is satisfied 
exactly in many situations. As for concavity, it is equivalent to super-additivity 
(once homogeneity is assumed). This is just the so-called “Wholesale Principle,”
where by grouping two or more orders together we obtain a utility which is no
less than the sum total of the utilities of the orders separately. Thus, to a man
on a deserted road, for example, the value of a functioning automobile and 10
gallons of gasoline will usually be higher than the utility of an automobile without
gasoline plus the utility of the 10 gallons of gasoline alone.

4 For a given supply a set of prices is called an equilibrium providing the supply can meet
the demand and all surplus commodities have zero prices (i.e., are free).

6 The existence of equilibrium is demonstrated under less restrictive assumptions in [1]
and [6].
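
The stated equivalence of concavity and super-additivity for a homogeneous function can be verified in a few lines. The derivation below is a sketch supplied for illustration; it assumes only that f is positively homogeneous of order 1 on the nonnegative orthant, with $0 \le \lambda \le 1$ in the second part.

```latex
% Concavity implies super-additivity (using homogeneity with factor 2):
\begin{aligned}
f(x + y) &= 2\, f\!\left(\tfrac{1}{2}x + \tfrac{1}{2}y\right)
            && \text{(homogeneity)} \\
         &\ge 2\left[\tfrac{1}{2} f(x) + \tfrac{1}{2} f(y)\right]
            && \text{(concavity)} \\
         &= f(x) + f(y).
\end{aligned}

% Super-additivity implies concavity (using homogeneity with factors
% \lambda and 1 - \lambda):
\begin{aligned}
f\bigl(\lambda x + (1 - \lambda) y\bigr)
         &\ge f(\lambda x) + f\bigl((1 - \lambda) y\bigr)
            && \text{(super-additivity)} \\
         &= \lambda f(x) + (1 - \lambda) f(y)
            && \text{(homogeneity)}.
\end{aligned}
```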

It is noteworthy that, unlike some treatments of Paretoan theory, we do not 
require personal utilities to be differentiable, although concavity implies
differentiability almost everywhere (see [5]). By asking that utility functions be
everywhere differentiable, such “reasonable” functions as the minimum of
linear (homogeneous) functions would be excluded; the latter, however, meet 
all requirements imposed in this paper. 

2. Definitions

Our model will be described within the framework of the theory of convex
inequalities in a finite dimensional Euclidean space. $R^n$ will denote the set of all
real n-tuples, and $R_+^n$ will denote the set of those n-tuples having each coordinate
nonnegative. We shall use the customary matrix notation; i.e., if A and B are
matrices of conforming dimensions, then AB is the usual matrix product.

Vector inequalities will mean that the inequality in question holds for each
component. If f is a function from $R_+^n$ to R (denoted $f: R_+^n \to R$) then f is
concave providing $f[\lambda x + (1 - \lambda)y] \ge \lambda f(x) + (1 - \lambda)f(y)$ for all $x, y \in R_+^n$
(“$x \in R_+^n$” is to be read “x is a member of $R_+^n$”) and $\lambda \in [0, 1]$; f is (positively)
homogeneous, providing $f(\lambda x) = \lambda f(x)$ whenever $x \in R_+^n$ and $\lambda \in R_+$.
Continuity of f is with respect to the usual metric, i.e., if for $x \in R^n$ we denote the
real number $\sqrt{xx}$ by $\|x\|$, then f is continuous providing that for each sequence

$x_k \in R_+^n$ (k = 1, 2, ...)

such that $\|x_k - x\|$ converges to zero for some $x \in R_+^n$, it is also true that the
sequence $f(x_k)$ converges to $f(x)$. The statement “X is the set of all x
such that p holds” is abbreviated, as usual, to “$X = \{x \mid p\}$.”
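
The definitions of concavity and positive homogeneity can be tested on a concrete utility of the kind mentioned in the Introduction, a minimum of linear homogeneous functions. The sketch below is an illustration only: the coefficients of the two linear functions, the grid of test points, and the function names are all invented, and a finite check does not of course establish the properties in general.

```python
from fractions import Fraction
from itertools import product

def u(x):
    """Hypothetical utility: the minimum of two linear homogeneous functions on R_+^2."""
    return min(2 * x[0] + x[1], x[0] + 3 * x[1])

def combine(lam, x, y):
    """The point lam * x + (1 - lam) * y, computed coordinate by coordinate."""
    return tuple(lam * xi + (1 - lam) * yi for xi, yi in zip(x, y))

def is_concave_on(points, weights):
    """Check u[lam x + (1 - lam) y] >= lam u(x) + (1 - lam) u(y) on the given data."""
    return all(
        u(combine(lam, x, y)) >= lam * u(x) + (1 - lam) * u(y)
        for x, y in product(points, repeat=2)
        for lam in weights
    )

def is_homogeneous_on(points, scales):
    """Check u(s x) = s u(x) for the given nonnegative scale factors."""
    return all(
        u(tuple(s * xi for xi in x)) == s * u(x)
        for x in points
        for s in scales
    )

points = list(product(range(3), repeat=2))        # grid points in R_+^2
weights = [Fraction(k, 4) for k in range(5)]      # lam = 0, 1/4, ..., 1
scales = [Fraction(0), Fraction(1, 2), Fraction(3)]

print(is_concave_on(points, weights), is_homogeneous_on(points, scales))
```

Exact rational arithmetic is used so that the homogeneity test can be an equality rather than a tolerance comparison.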

In the following discussion, we shall repeatedly apply the following duality
theorem of homogeneous programming (see [3]).

Lemma 1. Let A be an n × k matrix, $\varphi: R_+^n \to R$, $\psi: R_+^k \to R$, where $\varphi$ and
$\psi$ are homogeneous and continuous, $\varphi$ is concave, and $\psi$ is convex. Let

$X = \{x \mid x \in R_+^n \text{ and } xAy \le \psi(y) \text{ whenever } y \in R_+^k\}$

$Y = \{y \mid y \in R_+^k \text{ and } xAy \ge \varphi(x) \text{ whenever } x \in R_+^n\}$

and consider the statements

($a_1$) if $y \in R_+^k$, $Ay \ge 0$, $\psi(y) \le 0$, then y = 0;

($b_1$) if $x \in R_+^n$, $xA \le 0$, $\varphi(x) \ge 0$, then x = 0;

($a_2$) there is an $x_0 \in R_+^n$ such that $x_0 Ay < \psi(y)$ whenever $y \in R_+^k$ and $y \neq 0$;

($b_2$) there is a $y_0 \in R_+^k$ such that $xAy_0 > \varphi(x)$ whenever $x \in R_+^n$ and $x \neq 0$.

The conclusions of the theorem are

(i) ($a_1$) and ($a_2$) are equivalent;

($b_1$) and ($b_2$) are equivalent;

(ii) if $x_0 \in X$ maximizes $\varphi$ on X and ($a_1$) holds, then there is a $y_0 \in Y$ which
minimizes $\psi$ on Y and $\psi(y_0) = \varphi(x_0) = x_0 Ay_0$;

(iii) if $y_0 \in Y$ minimizes $\psi$ on Y and ($b_1$) holds, then there is an $x_0 \in X$ which
maximizes $\varphi$ on X and $\varphi(x_0) = \psi(y_0) = x_0 Ay_0$.

We are now in a position to give a precise description of the model with which
we shall be concerned. There are m consumers denoted $B_1, \ldots, B_m$ competing
for n goods denoted $G_1, \ldots, G_n$. The consumer $B_i$ has a positive income of $\beta_i$
units of money, and we assume that the monetary unit has been chosen so that
$\sum_{i=1}^m \beta_i = 1$. The behavior of $B_i$ is characterized by a personal utility function
$u_i$, where $u_i: R_+^n \to R_+$ and each $u_i$ is assumed to be concave, homogeneous,
continuous, and nonconstant. Thus, if faced with a situation where the cost of
one unit of $G_j$ is $\pi_j$, $B_i$ will demand any order $x = (\xi_1, \ldots, \xi_n) \in R_+^n$ which
maximizes his utility subject to his budget constraint $xp = \sum_{j=1}^n \xi_j \pi_j \le \beta_i$.
More formally, the demand set of $B_i$ at prices $p = (\pi_1, \ldots, \pi_n) \in R_+^n$ is
defined by

(1) $D_i(p) = \{x \mid x \text{ maximizes } u_i \text{ subject to } x \in R_+^n \text{ and } xp \le \beta_i\}$.

Note that $D_i(p)$ may consist of a single element, or an infinite number of elements
(although we see readily that it is convex then), or it may even be empty
(as for instance when p = 0). We do know that if p > 0 then, $u_i$ being continuous,
$D_i(p)$ is nonempty.

As an immediate application of Lemma 1, we can prove that $x_i \in D_i(p)$ if
and only if there exists a positive real number $\alpha_i$ such that

(2)  (i) $x_i \in R_+^n$

     (ii) $\alpha_i x_i p = \alpha_i \beta_i = u_i(x_i)$

     (iii) $\alpha_i x p \ge u_i(x)$ whenever $x \in R_+^n$.

First, if (2) holds then $x_i \in R_+^n$, $x_i p = \beta_i$, and for any $x \in R_+^n$ such that
$xp \le \beta_i$, we have $u_i(x) \le \alpha_i xp \le \alpha_i \beta_i = u_i(x_i)$. Thus, $u_i(x) \le u_i(x_i)$ and
$x_i \in D_i(p)$. On the other hand, if $x_i \in D_i(p)$ then certainly $x_i \in R_+^n$; furthermore,
if in Lemma 1 we let $k = 1$, $A = p$, $\phi(x) = u_i(x)$, and $\psi(y) = \beta_i y$, then
($a_1$) is obviously satisfied so that (ii) of that lemma holds with $x_i = x_0$. Letting
$y_0 = \alpha_i$, we obtain (2).

In terms of the $D_i(p)$ we now define for every $p \in R_+^n$ the community (or
total) demand $D(p)$:

(3)  $D(p) = \{x \mid \text{there exist } x_1, \ldots, x_m \text{ with } x_i \in D_i(p),\ i = 1, \ldots, m, \text{ and } x = \sum_{i=1}^m x_i\}$.

If $x_i \in R_+^n$ for $i = 1, 2, \ldots, m$, then $x_1, \ldots, x_m$ is called a distribution.
For a given supply $s = (\sigma_1, \ldots, \sigma_n)$, where $\sigma_j$ denotes the supply of $G_j$, the
distribution $x_1, \ldots, x_m$ is said to be a feasible distribution, providing $\sum_{i=1}^m x_i \le s$.
The feasible distribution $x_1, \ldots, x_m$ is said to be an equilibrium distribution
(for $s$), providing there exists a price vector $p \in R_+^n$ such that $x_i \in D_i(p)$
for every $i$ and $sp \le 1$ (the last inequality expresses the fact that at equilibrium
only free goods can be in oversupply); $p$ is then called an equilibrium price vector
(for $s$). As a consequence of Lemma 6, it follows that $x \in D(p)$ if and only if
$p$ is an equilibrium price vector for $x$. Thus the problem of finding equilibrium
prices is simply that of “inverting” the set-valued function $D$; unfortunately
the “simplicity” is theoretical rather than practical.

The feasible distribution $x_1^0, \ldots, x_m^0$ is said to be a maximal distribution
(for $s$), providing it maximizes the social-welfare function $\psi$ on the set of all
feasible distributions, where $\psi$ is defined for every distribution $x_1, \ldots, x_m$ by

(4)  $\psi(x_1, \ldots, x_m) = \prod_{i=1}^m u_i(x_i)^{\beta_i}$.

If $u: R_+^n \to R$ and for $p \in R_+^n$ we define

(5)  $D_u(p) = \{x \mid x \text{ maximizes } u \text{ subject to } x \in R_+^n \text{ and } xp \le 1\}$,

then $u$ is called an aggregate (or social) utility function providing $D_u(p) = D(p)$
for every $p \in R_+^n$.

3. Statement and Proof of Principal Results 

We shall prove the following: 

Theorem 1. For a given supply $s$, equilibrium satisfactions are unique; i.e.,
if $x_1, \ldots, x_m$ and $x_1', \ldots, x_m'$ are two equilibrium distributions, then for each
$i$ we have $u_i(x_i) = u_i(x_i')$.

Theorem 2. There exists an aggregate utility function which is concave,
homogeneous, and continuous.

Theorem 3. Maximality Principle.

(i) Every equilibrium distribution is a maximal distribution.

(ii) If the supply is positive ($s > 0$), then every maximal distribution is an
equilibrium distribution.

Theorem 4. For a given supply $s$, consider the following problem:

(6)  Minimize $qs$ subject to $q \in R_+^n$ and $q \sum_{i=1}^m y_i \ge \prod_{i=1}^m u_i(y_i)^{\beta_i}$ for all distributions $y_1, \ldots, y_m$.

We conclude that

(i) if $q$ solves (6) and $s > 0$, then $qs > 0$ and $p = (1/qs)q$ is an equilibrium
price vector;

(ii) if $p$ is an equilibrium price vector with associated equilibrium distribution
$x_1, \ldots, x_m$ and $\lambda = \prod_{i=1}^m u_i(x_i)^{\beta_i}$, then $\lambda > 0$ and $q = \lambda p$ solves
(6).
Remark. In terms of Lemma 1, Theorem 4 can be interpreted by using the
“dual” of (6), which is

(7)  maximize $\prod_{i=1}^m u_i(x_i)^{\beta_i}$ subject to $x_i \in R_+^n$ for all $i$, and $\sum_{i=1}^m x_i \le s$.

This is precisely the condition for a maximal distribution; hence Theorem 4 is in a
sense the dual of Theorem 3.

We first prove Theorem 1. Let $x_1, \ldots, x_m$ and $x_1', \ldots, x_m'$ be two equilibrium
distributions, and let $p$, $p'$ be their associated equilibrium price vectors.
If $\alpha_1, \ldots, \alpha_m$ and $\alpha_1', \ldots, \alpha_m'$ are the positive numbers given by (2), then

(8)  $\alpha_i x_i' p \ge u_i(x_i') = \alpha_i' \beta_i$ and $\alpha_i' x_i p' \ge u_i(x_i) = \alpha_i \beta_i \quad (i = 1, \ldots, m)$.

Also, $\sum_{i=1}^m x_i \le s$ and $\sum_{i=1}^m x_i' \le s$; therefore $\sum_{i=1}^m x_i' p \le sp \le 1$ and
$\sum_{i=1}^m x_i p' \le sp' \le 1$ so that

(9)  $1 \ge \sum_{i=1}^m x_i' p \ge \sum_{i=1}^m \frac{\alpha_i'}{\alpha_i} \beta_i$ and $1 \ge \sum_{i=1}^m x_i p' \ge \sum_{i=1}^m \frac{\alpha_i}{\alpha_i'} \beta_i$.

Now if $a$ is a positive real number, then $a + (1/a) - 2 = (1/a)(a - 1)^2 \ge 0$,
and equality holds if and only if $a = 1$. Adding the inequalities in (9) we have

(10)  $2 \ge \sum_{i=1}^m \left( \frac{\alpha_i'}{\alpha_i} + \frac{\alpha_i}{\alpha_i'} \right) \beta_i \ge 2 \sum_{i=1}^m \beta_i = 2$.

Hence $\alpha_i = \alpha_i'$ (because each $\beta_i > 0$), and from (8) we then have $u_i(x_i) =
u_i(x_i')$, as desired.

It should be noted that in fact $p$ and $p'$ are interchangeable; i.e., $p$ corresponds
to $x_1', \ldots, x_m'$ and $p'$ corresponds to $x_1, \ldots, x_m$.

Before proceeding with proofs of Theorems 2 , 3, and 4, we require several 
lemmas. 

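Lemma 2. If $\alpha, \beta, \gamma$ are real numbers with the property that $\alpha(1 - \lambda^\beta) \ge
\gamma(1 - \lambda)$ for all $\lambda$ in some open neighborhood of $\lambda = 1$, then $\alpha\beta = \gamma$.

Proof. For $\lambda < 1$ we have $\alpha(1 - \lambda^\beta)/(1 - \lambda) \ge \gamma$, while for $\lambda > 1$ we have
$\alpha(1 - \lambda^\beta)/(1 - \lambda) \le \gamma$. However, as is readily seen,

$\lim_{\lambda \to 1} (1 - \lambda^\beta)/(1 - \lambda) = \beta$;

thus $\gamma \le \alpha\beta \le \gamma$ and the conclusion follows.

Lemma 3. Let $\beta_1, \ldots, \beta_m$ be positive real numbers such that $\sum_{i=1}^m \beta_i = 1$.
Consider the function $f: R_+^m \to R_+$ defined by $f(\xi_1, \ldots, \xi_m) = \prod_{i=1}^m \xi_i^{\beta_i}$.
Then $f$ is homogeneous, continuous, and concave.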





Proof. Continuity and homogeneity are obvious. To show $f$ concave, it suffices
to prove that it is concave on the interior of $R_+^m$ (because $f$ is continuous).
In the interior of $R_+^m$, $f$ is differentiable and we may apply the well-known theorem
(see [5], p. 88) that $f$ is concave if the quadratic form $Q$ of its second partial
derivatives is negative-semidefinite. Now if $z = (\xi_1, \ldots, \xi_m)$ is in the interior
of $R_+^m$ (i.e., $z > 0$), then

$Q_z(\eta_1, \ldots, \eta_m) = f(z) \left[ \left( \sum_{i=1}^m \frac{\beta_i \eta_i}{\xi_i} \right)^2 - \sum_{i=1}^m \frac{\beta_i \eta_i^2}{\xi_i^2} \right]$.

Since $f(z) > 0$, we need only show that for every $(\eta_1, \ldots, \eta_m)$ the expression
in brackets is nonpositive. To this end we avail ourselves of the Cauchy-Schwarz
inequality, which states that if $u, v \in R^m$ then $(uv)^2 \le (uu)(vv)$. Let $u$, $v$ be
the vectors whose $i$th components are $\sqrt{\beta_i}$ and $\eta_i \sqrt{\beta_i}/\xi_i$, respectively. Then

$uu = \sum_{i=1}^m \beta_i = 1$, $\quad uv = \sum_{i=1}^m \frac{\beta_i \eta_i}{\xi_i}$, $\quad$ and $\quad vv = \sum_{i=1}^m \frac{\beta_i \eta_i^2}{\xi_i^2}$.

Thus,

$Q_z(\eta_1, \ldots, \eta_m) = f(z)[(uv)^2 - (uu)(vv)] \le 0$. (Q.E.D.)
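
The negative-semidefiniteness argument in the proof of Lemma 3 can be spot-checked numerically. The following is only a sketch under stated assumptions: the weights `beta`, the sampling ranges, and the function names are hypothetical choices made for illustration, and the quadratic form is written as $f(z)[(uv)^2 - (uu)(vv)]$ with $uu = 1$.

```python
import math
import random

def f(z, beta):
    """Weighted geometric mean f(z) = prod z_i ** beta_i (Lemma 3)."""
    return math.prod(zi ** bi for zi, bi in zip(z, beta))

def quadratic_form(z, eta, beta):
    """Q_z(eta) = f(z) * [(sum b_i e_i / z_i)^2 - sum b_i (e_i / z_i)^2]."""
    ratios = [e / zi for e, zi in zip(eta, z)]
    uv = sum(b * r for b, r in zip(beta, ratios))
    vv = sum(b * r * r for b, r in zip(beta, ratios))
    return f(z, beta) * (uv * uv - vv)  # uu = sum(beta) = 1

random.seed(0)
beta = [0.2, 0.3, 0.5]  # hypothetical positive weights summing to 1
worst = max(
    quadratic_form(
        [random.uniform(0.1, 5.0) for _ in beta],   # interior point z > 0
        [random.uniform(-3.0, 3.0) for _ in beta],  # arbitrary direction eta
        beta,
    )
    for _ in range(10_000)
)
print("largest sampled value of Q:", worst)
```

Every sampled value should be nonpositive up to rounding; the form vanishes when `eta` is proportional to `z`, which is the equality case of the Cauchy-Schwarz inequality.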

Lemma 4. Let $\beta_1, \ldots, \beta_m$ be positive real numbers with $\sum_{i=1}^m \beta_i = 1$. For
any $z = (\zeta_1, \ldots, \zeta_m) \in R_+^m$ such that $\sum_{i=1}^m \zeta_i \le 1$, we have

(11)  $\prod_{i=1}^m \zeta_i^{\beta_i} \le \prod_{i=1}^m \beta_i^{\beta_i}$,

and equality holds in (11) if and only if $\zeta_i = \beta_i$ for all $i$.

Proof. From Lemma 3 we know that $f$, as defined there, is continuous; thus
there is a $z_0 = (\xi_1, \ldots, \xi_m)$ which maximizes $f$ on the compact set of all
$(\zeta_1, \ldots, \zeta_m) \in R_+^m$ such that $\sum_{i=1}^m \zeta_i \le 1$. A direct check reveals that (ii) of
Lemma 1 may be applied, and we know then that there exists an $\eta \in R_+$ such
that

(12)  $\eta = \eta \sum_{i=1}^m \xi_i = \prod_{i=1}^m \xi_i^{\beta_i}$ and $\eta \sum_{i=1}^m \zeta_i \ge \prod_{i=1}^m \zeta_i^{\beta_i}$ for all $(\zeta_1, \ldots, \zeta_m) \in R_+^m$.

Fixing a $k \in \{1, 2, \ldots, m\}$ and letting $\zeta_i = \xi_i$ if $i \ne k$ and $\zeta_k = \lambda \xi_k$ (where
$\lambda \in R_+$), we obtain from (12)

(13)  $\eta [1 + (\lambda - 1) \xi_k] \ge \lambda^{\beta_k} \eta$.

Note that $\eta > 0$; otherwise, from (12), $f(z) \le 0$ for all $z \in R_+^m$, which is
clearly impossible. Thus, from (13), $1 - \lambda^{\beta_k} \ge \xi_k (1 - \lambda)$ for all $\lambda \in R_+$, and,
by Lemma 2, $\xi_k = \beta_k$; this completes the proof.
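
Inequality (11) of Lemma 4 says that, on the set where the coordinates sum to at most 1, the weighted product is largest at the point whose coordinates equal the weights. A minimal numerical sketch follows; the weights, the random sampling scheme, and all names are assumptions introduced for illustration.

```python
import math
import random

def weighted_product(z, beta):
    """prod z_i ** beta_i, the function bounded in inequality (11)."""
    return math.prod(zi ** bi for zi, bi in zip(z, beta))

random.seed(1)
beta = [0.5, 0.3, 0.2]                  # hypothetical weights summing to 1
bound = weighted_product(beta, beta)    # right-hand side of (11)

violations = 0
for _ in range(10_000):
    raw = [random.random() for _ in beta]
    scale = random.random() / sum(raw)  # forces sum(z) <= 1
    z = [scale * r for r in raw]
    if weighted_product(z, beta) > bound + 1e-12:
        violations += 1

print("bound:", bound, "violations:", violations)
```

No sampled point should exceed the bound, and the bound itself is attained at `z = beta`.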

Lemma 5. The social welfare function $\psi$ is concave, homogeneous, and continuous.

Proof. We have $\psi(x_1, \ldots, x_m) = \prod_{i=1}^m u_i(x_i)^{\beta_i}$ for each distribution $x_1,
\ldots, x_m$. Since each $u_i$ is homogeneous and continuous and $\sum_{i=1}^m \beta_i = 1$, $\psi$ is
homogeneous and continuous. Concavity of $\psi$ follows from the concavity of each
$u_i$ and from the fact that the function $f$ of Lemma 3 is concave and non-decreasing
(i.e., if $0 \le z \le z'$, then $f(z) \le f(z')$, which is a consequence of the fact
that each $\beta_i > 0$).

Lemma 6. Let $g: R_+^n \to R$ be concave and bounded below; then $g$ is nondecreasing.

Proof. Suppose $x, y \in R_+^n$, $x \ge y$, and $g(x) < g(y)$. For every positive integer
$k$, let $z_k = kx + (1 - k)y = k(x - y) + y$. Thus, $z_k \in R_+^n$ and $x =
(1/k) z_k + (1 - (1/k)) y$ for $k = 1, 2, \ldots$. From the concavity of $g$ we then
have $g(x) \ge (1/k) g(z_k) + (1 - (1/k)) g(y)$, or

$g(z_k) \le k[g(x) - g(y)] + g(y)$,

which contradicts the boundedness of $g$. We can now prove Theorems 2 through 4.

Proof of Theorem 3. If $x_1, \ldots, x_m$ is an equilibrium distribution with associated
equilibrium price vector $p$ and $y_1, \ldots, y_m$ is a feasible distribution,
then $\sum_{i=1}^m y_i p \le sp \le 1$, and from (2) and Lemma 4 we have

(14)  $\prod_{i=1}^m u_i(y_i)^{\beta_i} \le \prod_{i=1}^m (\alpha_i y_i p)^{\beta_i} \le \prod_{i=1}^m (\alpha_i \beta_i)^{\beta_i} = \prod_{i=1}^m u_i(x_i)^{\beta_i}$.

Thus $x_1, \ldots, x_m$ is a maximal distribution. On the other hand, if $x_1, \ldots, x_m$
is a maximal distribution and $s > 0$, then (in view of Lemma 5) we may apply
Lemma 1 (ii), which tells us that there exists a $q \in R_+^n$ such that

(15)  $q \sum_{i=1}^m x_i = \prod_{i=1}^m u_i(x_i)^{\beta_i} = sq$ and $q \sum_{i=1}^m y_i \ge \prod_{i=1}^m u_i(y_i)^{\beta_i}$ for every distribution $y_1, \ldots, y_m$.

If $sq = 0$, then (since $s > 0$) $q = 0$, which would mean that $\prod_{i=1}^m u_i(y_i)^{\beta_i} \le 0$
for all distributions $y_1, \ldots, y_m$, contradicting our assumption that each $u_i$ is
nonnegative and nonconstant. Let $p = (1/sq)q$; the expressions in (15) then
become

(16)  $p \sum_{i=1}^m x_i = sp = 1$ and $p \sum_{i=1}^m y_i \ge \prod_{i=1}^m \left[ \frac{u_i(y_i)}{u_i(x_i)} \right]^{\beta_i}$ for every distribution $y_1, \ldots, y_m$.

We now fix an integer $k$, where $1 \le k \le m$, and in (16) we let $y_i = x_i$ when
$i \ne k$, $y_k = x$. Thus for every $x \in R_+^n$, we have

(17)  $p \sum_{i=1}^m y_i = p \sum_{i=1}^m x_i - p x_k + px = 1 + p(x - x_k) \ge \left[ \frac{u_k(x)}{u_k(x_k)} \right]^{\beta_k}$.

Letting $x = \lambda x_k$, where $\lambda \in R_+$, we obtain from (17)

(18)  $p x_k (\lambda - 1) \ge \lambda^{\beta_k} - 1$

so that, by Lemma 2, $p x_k = \beta_k$. Now if $x \in R_+^n$ and $px \le \beta_k$, then from (17)
we have $[u_k(x)/u_k(x_k)]^{\beta_k} \le 1 + px - p x_k \le 1 + \beta_k - \beta_k = 1$, i.e., $u_k(x) \le
u_k(x_k)$ and $x_k \in D_k(p)$; the last in conjunction with $ps = 1$ and $\sum_{i=1}^m x_i \le s$
(this is so because $x_1, \ldots, x_m$ is maximal and hence feasible) shows that $x_1,
\ldots, x_m$ is an equilibrium distribution. This completes the proof of Theorem 3.

Proof of Theorem 4. If $q$ solves (6) and $s > 0$, then it can be readily verified
that the assumptions of Lemma 1 (iii) are satisfied. Thus there exists a maximal
distribution $x_1, \ldots, x_m$ such that $qs = q \sum_{i=1}^m x_i = \prod_{i=1}^m u_i(x_i)^{\beta_i} > 0$.
Letting $p = (1/qs)q$ we obtain (16) which, as was shown in the proof of Theorem 3,
tells us that $x_1, \ldots, x_m$ is an equilibrium distribution with associated
equilibrium price vector $p$.

Let $p$, $q$, $x_1, \ldots, x_m$, and $\lambda$ be as in (ii) of Theorem 4; $\lambda > 0$ then follows
from (2). To prove that $q$ solves (6), it suffices to show that $q$ satisfies the constraint
inequalities of (6) because then, since $qs = \lambda ps \le \prod_{i=1}^m u_i(x_i)^{\beta_i}$, for
any $q'$ satisfying the constraint inequalities of (6) we have

$q's \ge q' \sum_{i=1}^m x_i \ge \prod_{i=1}^m u_i(x_i)^{\beta_i} \ge qs$

and thus $qs$ is minimal. Let us assume then that there exists a distribution
$y_1, \ldots, y_m$ such that $q \sum_{i=1}^m y_i < \prod_{i=1}^m u_i(y_i)^{\beta_i}$. Thus, by definition of $q$,

$p \sum_{i=1}^m y_i < \frac{\prod_{i=1}^m u_i(y_i)^{\beta_i}}{\prod_{i=1}^m u_i(x_i)^{\beta_i}}$,

i.e., using $u_i(y_i) \le \alpha_i p y_i$ and $u_i(x_i) = \alpha_i \beta_i$ from (2),

(19)  $\left( \sum_{i=1}^m p y_i \right) \prod_{i=1}^m \beta_i^{\beta_i} < \prod_{i=1}^m (p y_i)^{\beta_i}$,

which can readily be seen to contradict Lemma 4. This completes the proof of
Theorem 4.

We prove Theorem 2 by first giving an explicit definition of a social utility
function and then proving that it has the required properties. For each $s \in R_+^n$
we let

(20)  $u(s) = \max \left\{ \prod_{i=1}^m u_i(x_i)^{\beta_i} \mid x_1, \ldots, x_m \text{ is a feasible distribution for } s \right\}$.

In particular, $u(s)$ is precisely the minimal value of $qs$ in (6) or, dually, the
value of the social-welfare function at any maximal distribution.
(i) u is homogeneous

Note that for each $i$, $u_i(0) = 0$ because $u_i$ is homogeneous and continuous;
hence $u(0) = 0$. Thus if $x \in R_+^n$ and $\lambda = 0$, then $u(\lambda x) = 0 = \lambda u(x)$. Assume
$\lambda > 0$. We know that there exist maximal (and hence feasible) distributions
$x_1, \ldots, x_m$ and $y_1, \ldots, y_m$ for $x$ and $\lambda x$, respectively. It follows then that

(21)  $\sum_{i=1}^m \lambda x_i \le \lambda x$ and $\sum_{i=1}^m \frac{y_i}{\lambda} \le x$;

therefore by the maximality of the two distributions

(22)  $u(\lambda x) \ge \prod_{i=1}^m u_i(\lambda x_i)^{\beta_i} = \lambda \prod_{i=1}^m u_i(x_i)^{\beta_i} = \lambda u(x) \ge \lambda \prod_{i=1}^m u_i\left(\frac{y_i}{\lambda}\right)^{\beta_i} = \prod_{i=1}^m u_i(y_i)^{\beta_i} = u(\lambda x)$,

i.e., $u(\lambda x) = \lambda u(x)$.

(ii) u is concave 

Let $x, y \in R_+^n$ and let $x_1, \ldots, x_m$ and $y_1, \ldots, y_m$ be maximal distributions
for $x$ and $y$, respectively. Thus $x + y \ge \sum_{i=1}^m (x_i + y_i)$, because $x \ge \sum_{i=1}^m x_i$
and $y \ge \sum_{i=1}^m y_i$, and

$u(x + y) \ge \prod_{i=1}^m u_i(x_i + y_i)^{\beta_i} \ge \prod_{i=1}^m u_i(x_i)^{\beta_i} + \prod_{i=1}^m u_i(y_i)^{\beta_i} = u(x) + u(y)$.

(The last inequality is a consequence of Lemma 5.) We have shown then that
$u(x + y) \ge u(x) + u(y)$ which, together with homogeneity of $u$, implies
that $u$ is concave.

(iii) u is continuous 

Let $x^k$, $k = 1, 2, \ldots$, be a sequence of points in $R_+^n$ converging to some
$x^0 \in R_+^n$. We wish to show that the corresponding functional values $u(x^k)$ also
converge to $u(x^0)$. For $k = 0, 1, 2, \ldots$, let $x_1^k, \ldots, x_m^k$ be a maximal distribution
for $x^k$. Thus $u(x^k) = \prod_{i=1}^m u_i(x_i^k)^{\beta_i}$ and $\sum_{i=1}^m x_i^k \le x^k$. The last
inequality implies that the $x_1^k, \ldots, x_m^k$ are bounded; let $x_1, \ldots, x_m$ be one of
their limit points. Then, since $\lim_{k \to \infty} x^k = x^0$, we must have $\sum_{i=1}^m x_i \le x^0$
and thus $u(x^0) \ge \prod_{i=1}^m u_i(x_i)^{\beta_i}$.

We have just demonstrated that

(23)  $u(x^0) \ge \limsup_{k \to \infty} u(x^k)$.

Next, we assert that there is a sequence of nonnegative real numbers $\lambda_k$ which
converge to 1 such that $\lambda_k x^0 \le x^k$ for $k = 1, 2, \ldots$. If $x^0 = 0$, we let $\lambda_k = 1$
for all $k$; otherwise we define $\lambda_k$ by $\lambda_k = \min_{\xi_j^0 > 0} (\xi_j^k / \xi_j^0)$, where we set $x^k =
(\xi_1^k, \ldots, \xi_n^k)$. That the $\lambda_k$ converge to 1 follows then from the fact that the
$x^k$ converge to $x^0$. Now $\sum_{i=1}^m \lambda_k x_i^0 \le \lambda_k x^0 \le x^k$, and therefore

$u(x^k) \ge \prod_{i=1}^m u_i(\lambda_k x_i^0)^{\beta_i} = \lambda_k u(x^0)$,

which shows that

(24)  $u(x^0) \le \liminf_{k \to \infty} u(x^k)$.

By combining (23) and (24), we obtain the desired result.

(iv) for every $p \in R_+^n$, $D(p)$ is a subset of $D_u(p)$

Let $x \in D(p)$ and let $x_1, \ldots, x_m$ be an associated distribution, i.e., $x_i \in D_i(p)$
for all $i$ and $\sum_{i=1}^m x_i = x$. We wish to show that if $y \in R_+^n$ and $yp \le 1$, then
$u(y) \le u(x)$. Let $y_1, \ldots, y_m$ be a maximal distribution for $y$; then, of course,
$\sum_{i=1}^m y_i p \le yp \le 1$, and by Lemma 4

(25)  $\prod_{i=1}^m (y_i p)^{\beta_i} \le \prod_{i=1}^m \beta_i^{\beta_i}$.

Applying (2), we have:

$u(x) \ge \prod_{i=1}^m u_i(x_i)^{\beta_i}$

$= \prod_{i=1}^m (\alpha_i \beta_i)^{\beta_i}$   from (2)

$\ge \prod_{i=1}^m (\alpha_i y_i p)^{\beta_i}$   from (25)

$\ge \prod_{i=1}^m u_i(y_i)^{\beta_i}$   from (2)

$= u(y)$.   (Q.E.D.)

(v) for every $p \in R_+^n$, $D_u(p)$ is a subset of $D(p)$

Suppose $x \in D_u(p)$, i.e., $u(x)$ is maximal subject to $x \in R_+^n$, $xp \le 1$. We
apply Lemma 1 with $k = 1$, $\phi = u$, $A = p$, and $\psi(y) = y$; a direct check reveals
that ($a_2$) is satisfied by letting $x_0 = 0$. From (ii) of the same lemma, we
then know that there exists a nonnegative real number $\eta$ such that

(26)  $\eta px = \eta = u(x)$ and $\eta py \ge u(y)$, whenever $y \in R_+^n$.

Note that $\eta$ is actually positive; otherwise $u$ would always be nonpositive, which
is certainly not the case.

If $x_1, \ldots, x_m$ is a maximal distribution for $x$, then

$\eta = u(x) = \prod_{i=1}^m u_i(x_i)^{\beta_i} > 0$,

and for any distribution $y_1, \ldots, y_m$, we have from (26)

(27)  $p \sum_{i=1}^m y_i \ge \frac{1}{\eta} u\left( \sum_{i=1}^m y_i \right) \ge \frac{1}{u(x)} \prod_{i=1}^m u_i(y_i)^{\beta_i} = \prod_{i=1}^m \left[ \frac{u_i(y_i)}{u_i(x_i)} \right]^{\beta_i}$.

This is precisely the second inequality in (16) so that, as shown in the proof
of Theorem 3, $x_i \in D_i(p)$ for $i = 1, \ldots, m$. Also, $1 = \sum_{i=1}^m x_i p \le xp \le 1$;
thus if in $\sum_{i=1}^m x_i \le x$ we have strict inequality for some component, then the
corresponding component of $p$ is zero. In view of Lemma 6, this means that
$x \in D(p)$; otherwise we could increase the component in question of any of the
$x_i$'s without altering the fact that $x_i \in D_i(p)$.
4. Examples

Example 1. We show here that if the $u_i$ are not all homogeneous, then an aggregate
utility need not exist. There are two consumers and two goods with

246\ 

52/ 


The incomes are given by $\beta_1 = 33/52$ and $\beta_2 = 19/52$. Clearly all conditions
of Section 2 are satisfied except that the $u_i$ are not homogeneous. It can be
readily verified that for $p = (1, 3)$ and $q = (3, 1)$ the individual demands contain


v) = 52 min ( £ + 3 t?, 3£ + y + 




%(£? v) — 52 mm ^24£ + 8rj, £ + 3?? -f- 


^ (10, 3) 
= 4 (4, 7) 


demanded by ft at p 


demanded by ft at p 
demanded by ft at g 
demanded by ft at g. 


However, $(x_1 + x_2)q = 1$ while $(y_1 + y_2)p = 44/52 < 1$, and both $u_i$ are
nondecreasing, hence no “savings” are possible and $y_1 + y_2$ is definitely not
demanded at prices $p$. The revealed preference is thus intransitive, and an
aggregate utility function does not exist.

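Example 2. We illustrate here Theorems 1 through 4 by means of a model
consisting of two consumers and two commodities, with all utilities actually
linear.

In general, if $u_i(\xi, \eta) = a_{i1}\xi + a_{i2}\eta$ (where $a_{i1}$ and $a_{i2}$ are positive real numbers)
and $B_i$ has income $\beta_i$, then for $p = (\pi_1, \pi_2) > 0$

$D_i(p) = \{(\beta_i/\pi_1, 0)\}$   if $\pi_2 a_{i1} > \pi_1 a_{i2}$

$D_i(p) = \{(0, \beta_i/\pi_2)\}$   if $\pi_2 a_{i1} < \pi_1 a_{i2}$

$D_i(p) = \{(\xi, \eta) \ge 0 \mid \pi_1 \xi + \pi_2 \eta = \beta_i\}$   if $\pi_2 a_{i1} = \pi_1 a_{i2}$.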


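Let us take a concrete example with incomes $\beta_1, \beta_2 > 0$, where $\beta_1 + \beta_2 = 1$
and

$u_1(\xi, \eta) = 2\xi + \eta$

$u_2(\xi, \eta) = \xi + 3\eta$.

It then follows that for $p = (\pi_1, \pi_2) > 0$,

$D(p) = \{(1/\pi_1, 0)\}$   if $3\pi_1 < \pi_2$

$D(p) = \{(\beta_1/\pi_1 + \xi, \eta) \mid (\xi, \eta) \ge 0,\ \pi_1 \xi + \pi_2 \eta = \beta_2\}$   if $3\pi_1 = \pi_2$

$D(p) = \{(\beta_1/\pi_1, \beta_2/\pi_2)\}$   if $\pi_2 < 3\pi_1 < 6\pi_2$

$D(p) = \{(\xi, \beta_2/\pi_2 + \eta) \mid (\xi, \eta) \ge 0,\ \pi_1 \xi + \pi_2 \eta = \beta_1\}$   if $\pi_1 = 2\pi_2$

$D(p) = \{(0, 1/\pi_2)\}$   if $2\pi_2 < \pi_1$.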


If the supply $s = (\sigma_1, \sigma_2) \in R_+^2$, then equilibrium exists if and only if $s \ne 0$.
If $\sigma_1 = 0$ and $\sigma_2 > 0$, then $p = (\pi, 1/\sigma_2)$, where $\pi > 2/\sigma_2$, is an equilibrium
price vector. Similarly, if $\sigma_1 > 0$, $\sigma_2 = 0$ then $p = (1/\sigma_1, \pi)$, where $\pi > 3/\sigma_1$,
is an equilibrium price vector. However, if $s > 0$ then equilibrium prices are
unique and are given as follows:

Case 1. $\beta_2/\sigma_2 < 3\beta_1/\sigma_1 < 6\beta_2/\sigma_2$;  $p = (\beta_1/\sigma_1, \beta_2/\sigma_2)$.

Case 2. $\beta_2/\sigma_2 \ge 3\beta_1/\sigma_1$;  $p = \frac{1}{\sigma_1 + 3\sigma_2}(1, 3)$.

Case 3. $\beta_1/\sigma_1 \ge 2\beta_2/\sigma_2$;  $p = \frac{1}{2\sigma_1 + \sigma_2}(2, 1)$.

Furthermore, equilibrium distributions and “pay-offs” are given by

Case 1. $x_1 = (\sigma_1, 0)$, $x_2 = (0, \sigma_2)$, $u_1(x_1) = 2\sigma_1$, $u_2(x_2) = 3\sigma_2$,

$u(s) = (2\sigma_1)^{\beta_1} (3\sigma_2)^{\beta_2}$.

Case 2. $x_1 = (\beta_1\sigma_1 + 3\beta_1\sigma_2, 0)$, $x_2 = (\beta_2\sigma_1 - 3\beta_1\sigma_2, \sigma_2)$,

$u_1(x_1) = 2\beta_1(\sigma_1 + 3\sigma_2)$, $u_2(x_2) = \beta_2(\sigma_1 + 3\sigma_2)$,

$u(s) = u_1(x_1)^{\beta_1} u_2(x_2)^{\beta_2} = (\sigma_1 + 3\sigma_2)(2\beta_1)^{\beta_1} \beta_2^{\beta_2}$.

Case 3. $x_1 = (\sigma_1, \beta_1\sigma_2 - 2\beta_2\sigma_1)$, $x_2 = (0, 2\sigma_1\beta_2 + \sigma_2\beta_2)$,

$u_1(x_1) = \beta_1(2\sigma_1 + \sigma_2)$, $u_2(x_2) = 3\beta_2(2\sigma_1 + \sigma_2)$,

$u(s) = u_1(x_1)^{\beta_1} u_2(x_2)^{\beta_2} = (2\sigma_1 + \sigma_2) \beta_1^{\beta_1} (3\beta_2)^{\beta_2}$.

Thus for $s = (\sigma_1, \sigma_2) \in R_+^2$,

$u(s) = (2\sigma_1 + \sigma_2) \beta_1^{\beta_1} (3\beta_2)^{\beta_2}$,   if $2\beta_2\sigma_1 \le \beta_1\sigma_2$

$u(s) = (2\sigma_1)^{\beta_1} (3\sigma_2)^{\beta_2}$,   if $\beta_1\sigma_2 \le 2\beta_2\sigma_1 \le 6\beta_1\sigma_2$

$u(s) = (\sigma_1 + 3\sigma_2)(2\beta_1)^{\beta_1} \beta_2^{\beta_2}$,   if $3\beta_1\sigma_2 \le \beta_2\sigma_1$.

Hence the social utility function is proportional to $u_1$ when $\sigma_1$ is small compared
with $\sigma_2$; it is proportional to $u_2$ when $\sigma_1$ is large compared with $\sigma_2$.
In the intermediate cone, $u(s)$ is “hyperbolic” (in the sense that the level
curves of $u$ in this region, if extended analytically to all of $R_+^2$, would be asymptotic
to the co-ordinate axes).
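
The piecewise formula for the social utility in Example 2 can be compared against a direct maximization of the social-welfare function over feasible distributions. The sketch below is illustrative only: the function names, the grid resolution, and the three test supplies are assumptions, and the brute-force grid search is a stand-in for a proper optimizer.

```python
def u_closed(s1, s2, b1):
    """Piecewise social utility for u1 = 2*xi + eta, u2 = xi + 3*eta."""
    b2 = 1.0 - b1
    if 2 * b2 * s1 <= b1 * s2:      # good 1 scarce: proportional to u1(s)
        return (2 * s1 + s2) * b1 ** b1 * (3 * b2) ** b2
    if 3 * b1 * s2 <= b2 * s1:      # good 1 plentiful: proportional to u2(s)
        return (s1 + 3 * s2) * (2 * b1) ** b1 * b2 ** b2
    return (2 * s1) ** b1 * (3 * s2) ** b2   # intermediate cone

def u_grid(s1, s2, b1, steps=200):
    """Brute-force max of u1(x1)^b1 * u2(x2)^b2 with x1 + x2 = s."""
    b2 = 1.0 - b1
    best = 0.0
    for i in range(steps + 1):
        a = s1 * i / steps              # amount of good 1 given to consumer 1
        for j in range(steps + 1):
            b = s2 * j / steps          # amount of good 2 given to consumer 1
            rest = max(0.0, (s1 - a) + 3 * (s2 - b))  # u2 of the remainder
            best = max(best, (2 * a + b) ** b1 * rest ** b2)
    return best

# Hypothetical supplies and incomes, one per case of the formula.
cases = [(1.0, 1.0, 0.5), (4.0, 1.0, 0.2), (1.0, 4.0, 0.8)]
for s1, s2, b1 in cases:
    print((s1, s2, b1), u_closed(s1, s2, b1), u_grid(s1, s2, b1))
```

Because the utilities are nondecreasing, it is enough to search distributions that exhaust the supply, which is why consumer 2 simply receives the remainder.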
PART TWO

STOCHASTIC DECISION MODELS



VII

COMMENTARY ON PART TWO:
STOCHASTIC DECISION MODELS


In most practical decision problems the decision maker is faced with uncertainties
about the future (e.g., about future prices, demands, technological
advances, and so on). The papers in this part present several models that have
been proposed to cope with uncertainties. The papers are divided into two
groups. The first group deals with general methods, the second group with
applications to specific problems in inventory control.


Programming Under Uncertainty 
Stochastic Programming 

In 1955 Beale [4], Dantzig (Chapter 28), and Radner [14] independently 
pointed out that certain “stochastic” linear programming problems could be 
reduced to ordinary linear programming problems. The method of doing this 
is described below. 

We begin by formulating the stochastic linear programming problem. Let
$A$, $b$, $c$ be respectively $m \times n$, $m \times 1$, $1 \times n$ matrices of real-valued random
variables. Let $x$ be a vector-valued function whose domain is the set of all
possible values of the triple $(A, b, c)$ and whose range is the set of all $n$-vectors
of real numbers. The stochastic linear programming problem is that of choosing
a function $x$ in the above class that

(1)  minimizes¹ $E(cx)$

subject to

(2)  $Ax = b$

(3)  $x \ge 0$.

An important special case is where the triple $(A, b, c)$ has only finitely many
values $(A_1, b_1, c_1), \ldots, (A_r, b_r, c_r)$ that occur with probabilities $p_1, \ldots, p_r$
($\sum_{i=1}^r p_i = 1$) respectively. In this event, denote by $x_i$ the value of the function
$x$ corresponding to $(A_i, b_i, c_i)$. The problem (1), (2), (3) is now completely
equivalent to the ordinary linear programming problem of finding $n$-vectors
$x_1, \ldots, x_r$ of real numbers that

(1)'  minimize $\sum_{i=1}^r p_i c_i x_i$

subject to

(2)'  $A_i x_i = b_i \quad (i = 1, \ldots, r)$

(3)'  $x_i \ge 0 \quad (i = 1, \ldots, r)$.

1 E stands for expected value.

In the problem formulated above, $x$ is allowed to depend on all elements of
$(A, b, c)$. In effect, the decision maker can observe the value of each element of
$(A, b, c)$ before choosing the values of his decision variables. In the most
interesting applications, however, the decision maker is required to select the
value of each coordinate of $x$ after observing only a specified subset of the values
of the elements of $(A, b, c)$. For example, suppose decisions are made at several
different points in time. Then decisions at one point will typically be made
after observation of the values of fewer random variables than would be observed
for decisions at a later point in time. This feature is characteristic of
most of the papers in this part. A second example occurs in large organizations
where complete circulation of information to all decision makers is not economical.
In such cases decisions in different parts of the organization (for
example, the sales and production departments of a manufacturer) would be
based on different (possibly overlapping) subsets of the values of the elements of
$(A, b, c)$. Radner discusses an interesting example of this type in Chapter 27.
In the above examples we are asking that the class of admitted functions $x$
be limited by restrictions of the following type: the $j$th coordinate of $x$, say $x_j$,
is allowed to depend upon only a specified subset of the elements of $(A, b, c)$.
In the special case discussed above in which $(A, b, c)$ takes on only finitely
many values, the new restrictions still permit reduction of the stochastic linear
programming problem to an ordinary linear programming problem. To see this
reduction, let $T_j(A, b, c)$ be the triple formed by replacing each element of
$(A, b, c)$ upon which $x_j$ is not allowed to depend by zero. Then we may ensure
that $x_j$ does not depend upon the collection of random variables indicated above
by imposing the constraints

(4)  $x_j^i = x_j^k$ for all $i$, $k$ such that $T_j(A_i, b_i, c_i) = T_j(A_k, b_k, c_k)$,

where $x_j^i$ denotes the $j$th coordinate of $x_i$. Constraint (4) is, of course,
linear, so the new problem of finding $x_1, \ldots, x_r$ to satisfy
(1)', (2)', (3)', (4) is still one in linear programming.
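
The effect of an information restriction of type (4) can be seen on a very small example. The sketch below is hypothetical throughout: it assumes a single order quantity `q` that must be chosen before a random demand is observed, with made-up costs and scenario probabilities, and it solves both versions by enumeration rather than by a linear programming code.

```python
# Hypothetical scenarios: (probability, demand).
scenarios = [(0.3, 2), (0.5, 4), (0.2, 7)]
ORDER, HOLD, SHORT = 1.0, 0.5, 3.0   # assumed unit costs

def scenario_cost(q, demand):
    """Cost in one scenario: ordering plus holding or shortage charges."""
    return (ORDER * q
            + HOLD * max(q - demand, 0)
            + SHORT * max(demand - q, 0))

def full_information_cost(max_q=10):
    """x may depend on everything: the problem separates by scenario,
    as in (1)'-(3)', so each scenario picks its own best q."""
    return sum(p * min(scenario_cost(q, d) for q in range(max_q + 1))
               for p, d in scenarios)

def restricted_cost(max_q=10):
    """Constraint (4): q must take the same value in every scenario."""
    return min(
        (sum(p * scenario_cost(q, d) for p, d in scenarios), q)
        for q in range(max_q + 1)
    )

cost_full = full_information_cost()
cost_restricted, q_star = restricted_cost()
print(cost_full, cost_restricted, q_star)
```

The restricted problem can never do better than the full-information one, since (4) only removes admissible functions; the gap between the two costs is one way of measuring what the missing information would be worth.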

The method outlined above for stochastic linear programming problems can
be extended easily to stochastic non-linear programming problems. Specifically,
suppose we replace $cx$ and $Ax = b$ in (1) and (2) by $c(x)$ and $A(x) \le b$ respectively,
where $c(\cdot)$ is a real-valued random function and $A(\cdot)$ is a vector-valued
($m$ coordinate) random function. Then if the triple $(A, b, c)$ has finitely
many values, an ordinary non-linear programming problem analogous to (1)',
(2)', (3)', (4) can be formed. In the event that $c(\cdot)$ and $A(\cdot)$ are convex, known
techniques of non-linear programming are available for solving the problem.
If either $c(\cdot)$ or $A(\cdot)$ is not convex, the computational problem is always formidable.

2 In actual computations, (4) would be used to eliminate variables from the problem (1)'-(3)'.

There is one special but very interesting case in which an extreme simplification
in the computations results. Specifically, suppose $Q(x, c)$ is a convex
quadratic function of $(x, c)$, with $c$ being a vector of random variables. The
problem is to find an $x$ that

(1)"  minimizes $EQ(x, c)$

subject to (2). Observe that the non-negativity restriction on $x$ is not retained
and that $A$ is not random. It turns out that one can solve (1)", (2) by replacing
$c$ and $b$ by their expected values and solving the resulting problem with
the classical calculus. The resulting solution $\bar{x}$, say, is the expected value of the
function $x$ that is optimal for the original problem. This result is useful because
it enables determination of the values of those coordinates of $x$ that are not
permitted to depend upon $b$ or $c$. Values of other coordinates may be determined
after the appropriate coordinates of $b$ and $c$ are observed. The result
was first established by Simon [17] and is the principal result upon which the
book by Holt et al. [12] is based.
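
A one-variable illustration of this certainty-equivalence property is sketched below. The quadratic `Q`, the three-point distribution of `c`, and the search grid are all assumptions chosen for illustration; `x` is a single number that is not allowed to depend on `c`, and there are no constraints.

```python
# Hypothetical distribution of the random coefficient c: (probability, value).
values = [(0.25, 1.0), (0.5, 4.0), (0.25, 10.0)]

def Q(x, c):
    """An assumed convex quadratic in (x, c)."""
    return (x - c) ** 2 + 0.5 * x ** 2

def expected_Q(x):
    return sum(p * Q(x, c) for p, c in values)

# Certainty-equivalent problem: replace c by its mean and minimize Q(x, c_bar).
# Setting dQ/dx = 2(x - c_bar) + x = 0 gives x = 2 * c_bar / 3.
c_bar = sum(p * c for p, c in values)
x_certainty = 2.0 * c_bar / 3.0

# Direct minimization of the expected cost on a fine grid, for comparison.
grid = [i / 1000 for i in range(10001)]
x_grid = min(grid, key=expected_Q)

print(c_bar, x_certainty, x_grid)
```

The two answers agree to within the grid spacing, because the variance of `c` only adds a constant to the expected cost and so does not move the minimizer.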

Dynamic Programming Under Uncertainty 

There is an alternative way to formulate stochastic decision problems in
which decisions are made sequentially in time. Specifically, we suppose that a
process can be in one of $N$ “states” labeled $1, 2, \ldots, N$. Initially the process
is in state 1, and it is desired to guide the process through a sequence of intermediate
states to state $N$ as profitably as possible. However, the decision maker
can only partially control the evolution of the process from state to state.
Instead, when the process is in state $i$, there are $K$ decisions available, labeled
$1, \ldots, K$. If decision $k$ is chosen, the known probability that the process then
goes to state $j$ is $p_{ij}^k$. We suppose that the states are so numbered that the
process cannot enter state $j$ from state $i$ if $j \le i$. Thus,

$\sum_{j=i+1}^N p_{ij}^k = 1 \quad (i = 1, \ldots, N - 1;\ k = 1, \ldots, K)$.

There is a cost $c_{ik}$ incurred when the process is in state $i$ and decision $k$ is made.
The problem is to choose decisions to be made in each state so as to minimize
the total expected cost.

Let $x_{ik}$ be the (unknown) joint probability that the process enters state $i$
sometime during its evolution from state 1 to state $N$ and that decision $k$ is
made at that time. The problem is then to choose $x_{ik}$ that

(5)  minimize $\sum_{i,k} c_{ik} x_{ik}$

subject to

(6)  $\sum_k x_{1k} = 1$

     $\sum_k x_{jk} - \sum_{i<j} \sum_k p_{ij}^k x_{ik} = 0 \quad (j = 2, \ldots, N - 1)$

(7)  $x_{ik} \ge 0 \quad (i = 1, \ldots, N - 1;\ k = 1, \ldots, K)$.

It is easy to see that one set of optimal dual variables $f_1, \ldots, f_{N-1}$ ($f_N \equiv 0$)
for this linear programming problem satisfies

(8)  $f_i = \min_{1 \le k \le K} \left[ c_{ik} + \sum_{j=i+1}^N p_{ij}^k f_j \right] \quad (i = 1, \ldots, N - 1)$.

This formula is familiar in dynamic programming and permits $f_{N-1}, f_{N-2}, \ldots, f_1$
to be calculated recursively in the order given. Evidently, one can interpret $f_i$
as the total expected cost incurred when the process starts in state $i$ and optimal
decisions are made thereafter. The optimal decision for state $i$ is determined,
as usual, as the minimizing value of $k$ in (8).
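
The backward recursion (8) is short enough to write out in full. The following is a minimal sketch rather than a reference implementation: states are numbered from 0 instead of 1, the data layout (`costs[i][k]` and `probs[i][k]` as a dictionary of successor probabilities) is an assumption made here, and the four-state instance is invented for illustration.

```python
def solve_recursion(costs, probs):
    """Evaluate f_i = min_k [c_ik + sum_{j>i} p_ij^k f_j], last state first.

    costs[i][k]    : cost of decision k in state i (nonterminal states only)
    probs[i][k][j] : probability of moving from i to j > i under decision k
    Returns (f, policy); the terminal state has value 0 and no decision.
    """
    n_states = len(costs) + 1
    f = [0.0] * n_states
    policy = [None] * (n_states - 1)
    for i in range(n_states - 2, -1, -1):
        best_value = None
        for k, cost in enumerate(costs[i]):
            value = cost + sum(probs[i][k].get(j, 0.0) * f[j]
                               for j in range(i + 1, n_states))
            if best_value is None or value < best_value:
                best_value, policy[i] = value, k
        f[i] = best_value
    return f, policy

# Hypothetical instance with four states (state 3 is terminal).
costs = [[1.0, 4.0], [4.0, 1.0], [2.0, 3.0]]
probs = [
    [{1: 0.5, 2: 0.5}, {3: 1.0}],   # state 0
    [{3: 1.0}, {2: 1.0}],           # state 1
    [{3: 1.0}, {3: 1.0}],           # state 2
]
f, policy = solve_recursion(costs, probs)
print(f, policy)
```

Each entry of `f` is the expected cost-to-go from that state under optimal decisions, and `policy` records the minimizing decision index in each nonterminal state.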

The formulation given above generalizes the discrete dynamic programming
model developed ((11)-(15)) in Part One. The formulation also extends to
situations in which there are infinitely many states and decisions.

The first linear programming formulation of a stochastic decision problem
along the above lines was given by Manne in the article reprinted as Chapter 29.
He considered an infinite period problem in which revisitation of states is
allowed and in which the objective is to minimize the long run average cost
per unit time. D'Epenoux [7] then showed that a similar formulation was
possible for the infinite stage model in which future costs are discounted and the
objective is to minimize the total expected discounted cost. In fact the model
given above is really a special case of his model.

The first complete account of the models described above (although not
from the linear programming viewpoint) is given in the excellent book of
Howard [13].

There has been no systematic comparison of the relative advantages of the
stochastic programming and dynamic programming approaches to stochastic
decision problems. It is known that each approach is computationally
superior for certain classes of problems. We remark also
that both approaches lead to very large programming problems in interesting
applications. This fact probably accounts for the relatively small number of
practical applications of these models. The development of the decomposition
algorithm is a significant step in attempting to exploit the structure appearing
in large scale programming problems to simplify computations.

Stochastic Constraints 

In formulating stochastic decision problems one sometimes wishes to require
that a certain inequality hold but must omit the restriction because it would
occasionally lead to infeasibility. In such cases an alternative is to require that
the inequality hold with a given probability. This kind of restriction is called
a stochastic constraint.

Stochastic constraints can be accommodated in either the stochastic programming
or dynamic programming formulations of decision problems. In the
stochastic programming formulation we replace (2), (3) with

(9)  $P(a_i x \le \beta_i) \ge \alpha_i \quad (i = 1, \ldots, m)$

where $a_i$ is the $i$th row of $A$, $\beta_i$ is the $i$th coordinate of $b$, and the $\alpha_i$ are scalars
with $0 < \alpha_i < 1$.

Unfortunately it seems in general to be difficult to solve problems of this sort.
Charnes, Cooper, and Symonds in Chapter 33 and Charnes and Cooper in
Chapter 30 have suggested approaches that are suitable in special cases. More
recent work along the same lines is given in Charnes and Cooper [5] and Thompson
et al. [10c]. We describe here a method that is given in Charnes and Cooper
[5] and generalizes an idea that first appeared in the paper reprinted as Chapter 33.

We impose the following assumptions:

(i) the $a_i$ and $c$ are given fixed vectors;

(ii) $b = (\beta_i)$ is a vector of jointly normally distributed random variables; and

(iii) $x$ is restricted to functions of the special form

(10)  $x = Db + y$

where $D$ is a given $n \times m$ matrix and $y$ is an $n$-vector that is not permitted to
depend upon $b$ but is allowed to vary.

Now substituting (10) into (9) we get

(11)  $P(a_i y \le \beta_i - a_i Db) \ge \alpha_i \quad (i = 1, \ldots, m)$.

Clearly $\beta_i - a_i Db$ is normally distributed with mean $\mu_i = E(\beta_i) - a_i D E(b)$
and variance $\sigma_i^2$.³ Thus letting $\Phi(t)$ denote the standard normal cumulative
distribution, (11) becomes

(12)  $1 - \Phi\left( \frac{a_i y - \mu_i}{\sigma_i} \right) \ge \alpha_i \quad (i = 1, \ldots, m)$.

Let $K_i$ be such that $\Phi(K_i) = 1 - \alpha_i$ (so that $K_i < 0$ when $\alpha_i > 1/2$). Then since
$\Phi$ is monotone increasing, (12) holds if and only if $(a_i y - \mu_i)/\sigma_i \le K_i$; that is,
(12) is equivalent to

(13)  $a_i y \le \mu_i + K_i \sigma_i \quad (i = 1, \ldots, m)$.

In a similar manner, substituting (10) into (1) gives $E(cx) = cDE(b) + cy$, whose
first term does not depend on $y$; the equivalent objective function is therefore

(14)  minimize $cy$.

Now the problem of finding an $n$-vector of real numbers $y$ satisfying (13), (14)
has been reduced from a problem with stochastic constraints to an ordinary
linear programming problem.

3 Since $\beta_i - a_i Db$ is linear in the $\beta_j$, the variance is easily calculated.
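
The step from the probabilistic requirement (11) to a deterministic linear bound on $a_i y$ can be sketched for a single constraint. The numbers below are hypothetical, and the bound is derived directly from (11): with $Z$ normal with mean `mu` and standard deviation `sigma`, the requirement $P(a_i y \le Z) \ge \alpha$ holds exactly when $a_i y$ is at most the $(1 - \alpha)$ quantile of $Z$. The standard-library class `statistics.NormalDist` supplies the normal distribution functions.

```python
from statistics import NormalDist

def deterministic_rhs(mu, sigma, alpha):
    """Largest value r such that P(r <= Z) >= alpha for Z ~ N(mu, sigma^2).

    K is chosen with Phi(K) = 1 - alpha, so r = mu + K * sigma; K is
    negative whenever alpha > 1/2.
    """
    k = NormalDist().inv_cdf(1.0 - alpha)
    return mu + k * sigma

# Hypothetical data for one constraint.
mu, sigma, alpha = 10.0, 2.0, 0.95
rhs = deterministic_rhs(mu, sigma, alpha)

# Check: at a_i y = rhs the constraint holds with probability exactly alpha.
prob_at_rhs = 1.0 - NormalDist(mu, sigma).cdf(rhs)
print(rhs, prob_at_rhs)
```

A high reliability level pulls the right-hand side below the mean, so the deterministic constraint is tighter than the one obtained by simply replacing the random quantity by its expected value.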

Stochastic constraints can also be accommodated in the dynamic programming
formulation of decision problems. In this case a typical stochastic constraint
takes the form of requiring that a subset $S$ of the states $2, \ldots, N - 1$
be entered with probability not exceeding $\alpha$ ($0 < \alpha < 1$). This constraint requires,
for example, that we add to the system (5)-(7) the inequality

(15)  $\sum_{i \in S} \sum_k x_{ik} \le \alpha$.

In this event the dual variables do not ordinarily satisfy (8), so that a simple
recursive solution of the dual problem is not possible. However, the simplex
method or any other algorithm that is available for solving linear programming
problems could be used to solve (5), (6), (7), (15).

Inventory Control with Stochastic Demands 

During the last fifteen years, there has been a steady stream of research
papers on inventory problems with stochastic demands. Several books [2, 11,
12, 1?] that give accounts of this work are now available. The stimulus for this
work was the pioneering paper of Arrow, Harris, and Marschak [3]. They
formulated a dynamic single product model (among others) with the following
characteristics: The demands occurring in each of an unbounded sequence of
periods are independent and identically distributed random variables. An (s, S)
ordering policy is followed. This ordering policy requires that if the amount x of
inventory on hand before ordering in a period is less than s, an order for S - x
units is placed; otherwise no order is placed. There is no lag in delivery of orders;
and unsatisfied demand in a period is lost. A cost structure (ordering, holding,
penalty costs) is imposed and a method of computing the long run (equivalent)
average cost per period is developed using renewal theory. The resulting
expression can then be minimized to find an (s, S) policy that is optimal.
Additional investigations along these lines are given by Beckmann in Chapter 34.
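
The (s, S) rule and the cost accounting just described can be sketched in a few lines. This is only an illustration: Arrow, Harris, and Marschak compute the long run average cost analytically by renewal theory, whereas the sketch below simply averages costs over a given demand sequence. The function names and all cost parameters are hypothetical.

```python
def order_quantity(x, s, S):
    """(s, S) rule: order up to S when the stock x before ordering is below s."""
    return S - x if x < s else 0

def average_cost(demands, s, S, x0, order_fixed, order_unit, hold_unit, penalty_unit):
    """Average cost per period with zero delivery lag and lost sales."""
    x = x0
    total = 0.0
    for d in demands:
        q = order_quantity(x, s, S)
        if q > 0:
            total += order_fixed + order_unit * q   # ordering cost
        on_hand = x + q                             # delivery is immediate
        sold = min(on_hand, d)
        total += penalty_unit * (d - sold)          # unsatisfied demand is lost
        x = on_hand - sold
        total += hold_unit * x                      # holding cost on ending stock
    return total / len(demands)

# Hypothetical three-period demand sequence and cost parameters.
avg = average_cost([4, 8, 2], s=5, S=10, x0=0,
                   order_fixed=2, order_unit=1, hold_unit=0.5, penalty_unit=3)
```

Evaluating `average_cost` over a long sequence of simulated demands for several (s, S) pairs gives a crude numerical counterpart of minimizing the long run average cost.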

The work of Arrow, Harris, and Marschak inspired Dvoretzky, Kiefer,
and Wolfowitz [9] to apply ideas used in decision theory [19] and sequential
analysis [18] to formulate and solve (at least formally) very general forms of
the basic inventory model outlined above. In particular they showed (in
principle) how to determine ordering rules that are optimal among all rules,
not just those of the (s, S) type. In so doing, they introduced the functional
equation approach of dynamic programming to this area of application.

Investigations of dynamic stochastic inventory models have been
directed toward establishing simple sufficient conditions on the demand
distributions and cost structure that ensure the optimality of simple ordering
policies (e.g., the (s, S) policy). These studies have also led to simple and
efficient computational procedures for determining optimal policies. Scarf [16]
gives an excellent recent survey of these results.

Investigations of this type were initiated by Bellman, Glicksberg, and Gross
in the article reprinted as Chapter 31. They showed that the (s, S) (with s = S)
policy is optimal for the Arrow-Harris-Marschak model under fairly weak
assumptions, provided that the ordering cost is linear. For the unbounded period
model, an optimal value of S (= s) is found as the root of a single transcendental
equation. The analysis exploits the special structure of the appropriate
functional equation. This work is generalized by Karlin [2a] and Karlin and Scarf
[2b] in several ways, e.g., to allow backlogging of unsatisfied demands and a
delivery lag. Further generalizations to the case where the demand distributions
vary over time and where the demand distributions are initially known
only up to an unknown parameter are given by Karlin in Chapter 32. Scarf
[15] established the brilliant result that an (s, S) policy is optimal among all
policies, provided that the ordering cost is the sum of a cost proportional to the
amount ordered and a fixed cost for placing an order, that the holding and
penalty cost function is convex, and that unsatisfied demand is backlogged.

TABLE II

Characteristics of Inventory Models

                              Chapter
                           31   32   33   34   35   36
Number of periods
  Single                   -    -    -    -    -    X
  Finite                   X    X    X    -    X    -
  Unbounded                X    X    -    X    X    -
Demand process
  Stationary               X    -    -    X    X    -
  Non-stationary           -    X    X    -    -    -
  Known distribution       X    X    X    X    X    X
  Unknown distribution     -    X    -    -    -    -
  Special distributions    -    X    X    X    -    X
Review
  Periodic                 X    X    X    X    X    -
  Continuous               -    -    -    X    -    -
Delivery lead time
  Zero                     X    -    X    -    -    -
  Non-negative             -    X    -    X    X    -
Unsatisfied demand
  Backlogged               -    X    X    X    X    -
  Lost                     X    X    -    -    -    -
Costs
  Stationary               X    X    -    X    X    -
  Non-stationary           -    -    X    -    -    -
Ordering
  Linear                   X    X    -    -    X    -
  Fixed + linear           -    -    -    X    X    X
  Convex                   X    X    X    -    -    -
Storage and penalty
  Linear                   X    -    X    X    -    X
  Fixed + linear           X    -    -    -    -    -
  Convex                   -    X    -    -    X    -
System Organization
  Single facility          X    X    X    X    -    -
  Parallel facilities      -    -    -    -    -    X
  Series structure         -    -    -    -    X    -

(X = the chapter treats this case; - = it does not.)

A different approach to finding optimal inventory policies for finite period
models is proposed by Charnes, Cooper, and Symonds in Chapter 33. They
use stochastic constraints and seek an optimal linear decision rule of the type
discussed earlier. This approach permits reduction of the problem to a
deterministic inventory problem of the type described in the Commentary on Part
One. Thus the special algorithms (e.g., Chapter 12) for solving deterministic
inventory problems can be used to solve the stochastic problem.

All of the models discussed above have been concerned with stocking
decisions made at a single point. In Chapter 35 Clark and Scarf consider a
generalization in which there are several echelons in which inventory is stored. And in
Chapter 36 and in Allen [1], Allen analyzes a one period model in which the
problem is to determine how stocks of a single product at several locations
can be redistributed to minimize combined shortage and redistribution costs.

The main features of the models discussed above are given in Table II.

THE APPLICATION OF LINEAR PROGRAMMING
TO TEAM DECISION PROBLEMS*1

ROY RADNER

University of California, Berkeley

In a team decision problem there are two or more decision variables, and these
different decisions can be made to depend upon different aspects of the
environment or information variables, the resulting payoff being a random variable.
The choice of optimal rules for selecting information variables and for making
decisions is the central problem of the economic theory of teams. This paper
shows, by means of an example, how linear programming can be applied to
obtain optimal team decision functions in the case in which the payoff to the
team is a convex polyhedral function of the decision variables.

1. Introduction 2 

In a team decision problem there are two or more decision variables, and these
different decisions can be made to depend upon different aspects of the
environment, i.e., upon different information variables. The choice of optimal rules for
selecting information variables and for making decisions is the central problem of
the economic theory of teams. In a previous paper [1], Marschak has given an
introduction to the main concepts of this theory. In the present paper I shall
show how the technique of linear programming can be used to solve a typical
class of team decision problems.

The “character” of a decision problem is determined by the form of the function
to be maximized, which I shall call the payoff function. Much of the
available data about business leads naturally to the formulation of decision
problems in terms of what might be called convex polyhedral payoff functions;
i.e., problems for which the space of decision variables can be divided into
regions, whose boundaries are linear, such that within each region the payoff is
a linear function of the decision variables. As is well known, such a problem is
equivalent to one of linear programming, and as I have shown in another paper [2], the
introduction of probabilistic uncertainty, and of the further complication of a
team situation, does not destroy the linear character of a programming problem
although it may result in a substantial increase in the “size” of the problem.

In this paper I will illustrate these ideas by means of an example; a general
formulation has been given in the paper just referred to, but the reader will
probably have no trouble in providing such a generalization himself. The
example to be used is about as simple as it can be without sacrificing any of the
three features that I want to illustrate, namely, (1) uncertainty, (2) the fact that
different decision variables can be made to depend upon different information
variables, and (3) a nondegenerate convex polyhedral payoff function. (Therefore,
the reader should not expect too much in the way of realism!) 3

1 I am indebted to A. Manne and J. Marschak for helpful comments on this paper.

Within the framework of the example, I shall (1) show how to apply linear 
programming to compute optimal decision rules for any particular structure of 
information and communication; (2) compare different structures of information 
and communication; (3) discuss an effect of joint constraints on the decision 
variables, in a partly "decentralized” team; and (4) point out the relationship 
between team decision problems and sequential decision problems. 

2. An Example 

Consider a “firm” with two activities, which I will label “production” and
“promotion,” and suppose that the levels of expenditure on these two activities
must be chosen for one period to come. Let a denote the amount of money
allotted to production, and let xa be the resulting quantity produced (there is only
one commodity concerned), where x can be interpreted as the “productivity” of
the production activity. Similarly, let b denote the amount of money allotted to
promotion, and let yb be the resulting demand generated. The quantity actually
sold will therefore be the smaller of the two quantities, xa and yb; if both the
product and the demand generated are “perishable,” and if the units are chosen
so that the price of the commodity is 1, then the profit resulting from the pair of
expenditures (a, b) is


min (xa, yb) — (a + b). 

If the business were at all profitable, then the firm would of course expand
its scale of operation indefinitely, were it not for the fact that its immediate
supply of capital is limited. This limit is not absolute, but there is a substantial
cost attached to obtaining more capital than is immediately available. Letting k
denote the capital limit, and (1 + f) denote the cost per dollar of additional
capital, the firm’s profit, as a function of the decision variables a and b, is given
by the payoff function

(1)  $u(a, b; x, y) = \min(xa, yb) - (a + b) - f \max(0, a + b - k).$

If $0 < xy/(x + y) - 1 < f$, then the function u just defined is indeed convex
and polyhedral, and its contours are shown in figure 1.

It is easy to see that, for given x and y, the function u attains its maximum
when

(2)  $ax = by, \quad a + b = k,$

that is to say, when

(3)  $a = \frac{ky}{x + y}, \quad b = \frac{kx}{x + y},$

and the maximum value of u is

$\max_{a,b} u = \left(\frac{xy}{x + y} - 1\right) k.$

3 A. Manne has aptly described this type of example as “allegorical” rather than
“realistic”.


Suppose now that the firm is uncertain about the actual values of the
“productivity” parameters, x and y, that will prevail during the period in question,
and that these values can be predicted accurately only at some substantial cost.
Two extreme alternatives suggest themselves. The firm could pay the cost and
obtain the relevant information, and then make the appropriate decisions, as
given by equation (3). This alternative will be called the case of full information.
On the other hand, the firm could rely only upon its knowledge of the probability
distribution of x and y, which is assumed to be known, and choose that pair of
expenditures that maximizes the expected, or average, profit. This will be called
the case of routine operation. Each of these two alternatives involves a different
structure of information. Which alternative is the better depends upon which
one results in the higher expected profit, net of the cost of information.

Fig. 1. Iso-profit curves for u(a, b; x, y)

A third, intermediate, alternative is suggested under a circumstance that has
been described by Marschak as “cospecialization of action and information.”
In this case, it costs less for the person in charge of production to get the needed
information about the parameter x than it does for the person in charge of
promotion to do so, and the reverse holds for the parameter y. If, in addition,
communication between these two persons is costly, it may be desirable to have the
decision about the variable a made by the production manager only in the light
of knowledge about x, and the decision about the variable b made by the
promotion manager only in the light of knowledge about y, all however according to
a decision rule agreed upon in advance. This last alternative will be called the
case of decentralization.

The possibility of costly communication may seem far-fetched in the
context of this simple example; however, if instead of a and b one thinks of two fairly
complicated sequences of decision, with each person (or department) getting new
information all the time, then it might indeed be costly to achieve a complete
exchange of information between the two.

As a primary step toward solving the over-all problem of choosing both a best 
information structure and best decision rules, one must, at least in principle, 
solve the various “sub-optimizing” problems of choosing the best decision rules 
for given information structures, and this is the type of problem that will be 
considered in detail in the rest of this paper. Before doing so (in the next section), 
it may be helpful to look at the results for some given numerical values of the 
parameters. 

Suppose that x and y are statistically independent, and can each take on one
of two values, with equal probability, the values being given in Table 1. Suppose,
furthermore, that the amount of free capital (k) equals 1,000 dollars, and that the
cost of additional capital (1 + f) is 2.7 dollars per dollar. (It is clear from
equations (2) and (3) that the value of k merely determines the scale of operation, and
does not influence the relative expenditures.) The maximum possible expected
profit for each of the three alternative information structures described above is
given in Table 2.
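
The full information figure can be checked by hand or with a few lines of code. The sketch below is only an illustration: it applies the rule obtained by solving equation (2) for a and b in each of the four equally likely parameter pairs (x = 3.0 or 3.4, y = 2.8 or 3.6) and averages the payoff (1). The function names are hypothetical.

```python
def payoff(a, b, x, y, k, f):
    """Payoff function (1): sales, less expenditure, less the cost of extra capital."""
    return min(x * a, y * b) - (a + b) - f * max(0.0, a + b - k)

def full_information_rule(x, y, k):
    """Solve a*x = b*y and a + b = k (equation (2)) for a and b."""
    return k * y / (x + y), k * x / (x + y)

k, f = 1000.0, 1.7                  # free capital; cost of extra capital 1 + f = 2.7
states = [(x, y) for x in (3.0, 3.4) for y in (2.8, 3.6)]   # probability 1/4 each

expected_profit = 0.0
for x, y in states:
    a, b = full_information_rule(x, y, k)
    expected_profit += 0.25 * payoff(a, b, x, y, k, f)

print(round(expected_profit))       # prints 592
```

The result, about 592.2, agrees with the full information entry of Table 2.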

In the “routine” case, the decision rules are, in a sense, degenerate; a best
value of $a$ and a best value of $b$ are to be chosen. In the “decentralized”
case, however, a pair of values $(a_1, a_2)$ and a pair of values $(b_1, b_2)$ are to be
chosen, where $a_1$ denotes the expenditure that will be made by the production
manager if he learns that $x$ will have the value 3.0, $a_2$ is the expenditure
corresponding to $x = 3.4$, etc. In the “full information” case, there are four pairs
of values $(a_{ij}, b_{ij})$ to be chosen, where $a_{ij}$ denotes the expenditure that will
be made corresponding to the pair of parameter values $(x_i, y_j)$,
etc. Table 3 shows the best decision rules for each of the three information
structures, and Table 4 shows the resulting allocations of resources.

TABLE 1

Joint Probability Distribution of x and y

              x = 3.0    x = 3.4
  y = 2.8       1/4        1/4
  y = 3.6       1/4        1/4

An interesting feature of the decentralized case (in this numerical example)
is that, under the best decision rules, the capital limit is actually exceeded by a
small amount whenever $a$ and $b$ both take on their largest values; this
occurs with probability 1/4. On the other hand, in the “routine” and “full
information” cases one could as well, from the beginning, have imposed the constraint

(4)  $a + b = k$

and taken the payoff function to be

$w(a, b; x, y) = \min(xa, yb) - (a + b).$

To impose the constraint (4) in the decentralized case would be too stringent.
In the present numerical example such a
constraint would reduce the expected profit from 541 to 534, or by 7.5% of the
difference between the routine and full information cases.

In general, if decisions are based upon different information variables, as
in the decentralized case, a certain degree of lack of complete coordination will
usually occur, and the requirement that a joint constraint never be violated
must be weighed against the cost of a violation.

There are many other conceivable information
structures besides the three already mentioned. Some of these will be mentioned
in Section 4. In the next section I will take up the problem of computing best
decision rules for any given information structure.

TABLE 2

Maximum Expected Profit

                             Routine   Decentralized   Full Information
  Maximum expected profit      503         541              592

TABLE 3

Best Decision Rules

  Routine:            a = 486,     b = 514
  Decentralized:      a_1 = 512,   a_2 = 452,   b_1 = 548,   b_2 = 426
  Full Information:   a_11 = 483,  b_11 = 517
                      a_12 = 452,  b_12 = 548
                      a_21 = 545,  b_21 = 455
                      a_22 = 514,  b_22 = 486

TABLE 4

Allocations of Resources Under Best Decision Rules

                                    Best Decision Rules
  Parameter Values          Routine   Decentralized   Full information
  x = 3.0, y = 2.8    a       486        512*             483
                      b       514        548*             517
  x = 3.0, y = 3.6    a       486        512              452
                      b       514        426              548
  x = 3.4, y = 2.8    a       486        452              545
                      b       514        548              455
  x = 3.4, y = 3.6    a       486        452              514
                      b       514        426              486

* Total expenditure exceeds immediate supply of capital.

3. Computing the Optimal Decision Rules 

The procedure for computing the optimal decision rules for a given
information structure involves converting the team decision problem into an equivalent
linear programming problem. The following discussion is in terms of two decision
variables and two random parameters, in order to make more transparent its
relation to the example of the previous section, but the generalization to any
number of decision variables and random parameters is obvious.

Suppose that

(5)  $u(a, b; x, y) = \min_n f_n(a, b; x, y), \qquad (n = 1, \ldots, N)$

where, for every $n$, $x$, and $y$, $f_n$ is linear in $a$ and $b$. Suppose also that $x$ and $y$
take on only a finite number of values, with probabilities $p(x, y)$. Furthermore,
let

r = R(x, y) be the information on which action a is based,

s = S(x, y) be the information on which action b is based,

A denote any function of r (a decision rule for a),

B denote any function of s (a decision rule for b),

Z denote any function of x and y.

Then the following two maximization problems are equivalent, in the sense that
the maximum values are the same, and (A, B) is a solution of Problem I if and
only if there is a Z such that (Z, A, B) is a solution of Problem II.

Problem I. Choose A and B so as to maximize

(6)  $E\,u(A[R(x, y)], B[S(x, y)]; x, y),$

subject to A(r), B(s) nonnegative.

Problem II. Choose Z, A and B so as to maximize

(7)  $E\,Z(x, y),$

subject to Z(x, y), A(r), B(s) nonnegative, and to the further constraints that

(8)  $Z(x, y) \le f_n(A[R(x, y)], B[S(x, y)]; x, y)$

for every n, x and y. (Note: the symbol E denotes mathematical expectation with
respect to the random parameters x and y.)

Since $E\,Z(x, y) = \sum_{x, y} p(x, y) Z(x, y)$ is a linear function of the “variables”
$Z(x, y)$, and since the constraints (8) are linear in $Z(x, y)$, $A(r)$ and $B(s)$, Problem
II is a standard linear programming problem.

Returning to the example of the previous section, let the function u be given
by equation (1); then it is easy to see that u can be expressed in the form (5)
by taking

(9)  $f_1 = (x - 1)a - b,$
     $f_2 = (x - 1 - f)a - (1 + f)b + fk,$
     $f_3 = -a + (y - 1)b,$
     $f_4 = -(1 + f)a + (y - 1 - f)b + fk.$

(These 4 functions correspond to the regions I-IV, respectively, in Fig. 1.)

Consider the decentralization example; there one has the information structure

(10)  $R(x, y) = x, \quad S(x, y) = y.$

Suppose, furthermore, that x and y can each take on one of two values, as in the
numerical example; then A will take on one of two values, say $a_1$ and $a_2$, according
as x equals $x_1$ or $x_2$; and likewise for B. Z(x, y), however, will take on one of four
values, say $z_{ij}$, corresponding to the four pairs $(x_i, y_j)$. In this case Problem II
takes the form:

Choose $z_{ij}$, $a_i$, $b_j$ so as to maximize $\sum_{i,j} p_{ij} z_{ij}$, subject to $z_{ij}$, $a_i$, $b_j$
nonnegative, and the set of linear constraints presented in matrix (“detached
coefficient”) form in Table 5.

TABLE 5

Constraint Matrix

   z_11   z_12   z_21   z_22   a_1    a_2    b_1    b_2
   E      0      0      0      -G_1   0      -H_1   0       <=  F
   0      E      0      0      -G_1   0      0      -H_2    <=  F
   0      0      E      0      0      -G_2   -H_1   0       <=  F
   0      0      0      E      0      -G_2   0      -H_2    <=  F

where

$E = (1, 1, 1, 1)^T,$
$G_i = (x_i - 1, \; x_i - 1 - f, \; -1, \; -1 - f)^T,$
$H_j = (-1, \; -1 - f, \; y_j - 1, \; y_j - 1 - f)^T,$
$F = (0, \; fk, \; 0, \; fk)^T,$

the four components of each vector corresponding, in order, to regions I, II, III
and IV on Fig. 1.

The fortunate pattern of 1’s and 0’s in the left half of the constraint matrix
of Table 5 is characteristic of a linear problem derived from one with a
polyhedral profit function; from a computational point of view, the addition of the
variables z does not represent a significant increase in the number of variables.
The scattering of 0’s throughout the right half of the constraint matrix is typical
of a team decision problem.

More generally, for the decentralized case in this example, if x can take on I
values, and y can take on J values, then for Problem II there will be IJ + I + J
variables, and 4IJ constraints. Because of the special structure of this problem,
the dual will always be considerably easier to solve than the primal form. In
order to solve the dual, it should not take substantially more computing effort
than a linear programming model with IJ constraint equations.
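
The layout of Problem II for the decentralized case can be made concrete with a short sketch. The code below is an illustration only, not a solver: it generates one row $z_{ij} - g\,a_i - h\,b_j \le c$ for each of the four linear pieces of the payoff and each pair $(x_i, y_j)$, and then, for fixed decision rules, takes the largest $z_{ij}$ the rows allow (which is the payoff itself). The function names and the variable ordering are assumptions; the nonnegativity conditions are not checked. The rules tried are the rounded decentralized rules of Table 3.

```python
from itertools import product

def build_problem_ii(xs, ys, k, f):
    """Rows (coeffs, rhs) meaning sum(coeffs * variables) <= rhs.

    Variables are ordered z_11, z_12, ..., z_IJ, a_1..a_I, b_1..b_J.
    """
    I, J = len(xs), len(ys)
    n_vars = I * J + I + J
    rows = []
    for (i, x), (j, y) in product(enumerate(xs), enumerate(ys)):
        # (coefficient of a, coefficient of b, constant) of the four linear pieces
        pieces = [
            (x - 1.0, -1.0, 0.0),
            (x - 1.0 - f, -(1.0 + f), f * k),
            (-1.0, y - 1.0, 0.0),
            (-(1.0 + f), y - 1.0 - f, f * k),
        ]
        for g, h, const in pieces:
            coeffs = [0.0] * n_vars
            coeffs[i * J + j] = 1.0        # z_ij
            coeffs[I * J + i] = -g         # a_i
            coeffs[I * J + I + j] = -h     # b_j
            rows.append((coeffs, const))
    return n_vars, rows

def expected_payoff(rows, a_rule, b_rule, probs):
    """Objective of Problem II when each z_ij is as large as the rows permit."""
    I, J = len(a_rule), len(b_rule)
    z = [float("inf")] * (I * J)
    for coeffs, rhs in rows:
        cell = coeffs.index(1.0, 0, I * J)                 # which z_ij this row bounds
        a_part = sum(coeffs[I * J + i] * a_rule[i] for i in range(I))
        b_part = sum(coeffs[I * J + I + j] * b_rule[j] for j in range(J))
        z[cell] = min(z[cell], rhs - a_part - b_part)
    return sum(p * v for p, v in zip(probs, z))

n_vars, rows = build_problem_ii([3.0, 3.4], [2.8, 3.6], k=1000.0, f=1.7)
value = expected_payoff(rows, [512.0, 452.0], [548.0, 426.0], [0.25] * 4)
```

For I = J = 2 this yields 8 variables and 16 constraints, in agreement with the counts IJ + I + J and 4IJ. The rounded rules give an expected profit of about 539.5, close to the 541 reported in Table 2; the small gap is presumably due to the rounding of the tabulated rules.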

4. Interpretation of Sequential Decision Problems as Team Problems 

Thus far in this paper the different decision variables in a team decision problem
have been interpreted as the decisions of different persons. Another class of
problems with the same formal structure arises from sequential decision problems for
even a single “person” (e.g., inventory and production scheduling problems).
In this case the different decision variables correspond to decisions taken at
different points of time. Thus, if there is a decision to be made in each of two
successive time periods, and information about the parameter values also tends
to become known sequentially, then, using the notation of the lemma of Section 3,
either of the following information structures is likely to be relevant:

(11)  $R(x, y) = \text{constant}, \quad S(x, y) = x$

or

(12)  $R(x, y) = x, \quad S(x, y) = (x, y).$

The technique of Section 3 applies just as well, of course, to these information 
structures as it did to the ones considered there. However, the special “triangular” 
character of the information structures that arise in single-person sequential 
problems often leads to computational simplifications that do not apply to team 
problems in general. On the other hand, it is clear that sequential or “dynamic” 
elements can be incorporated into a team decision problem, without altering the 
basic mathematical framework.

Summary

The essential character of the general models under consideration is that
activities are divided into two or more stages. The quantities of activities in the
first stage are the only ones that are required to be determined; those in the
second (or later) stages can not be determined in advance since they depend
on the earlier stages and the random or uncertain demands which occur on or
before the latter stage. It is important to note that the set of activities are
assumed to be complete in the sense that, whatever be the choice of activities in the
earlier stages (consistent with the restrictions applicable to their stage), there
is a possible choice of activities in the latter stages. In other words it is not
possible to get in a position where the programming problem admits of no solution.

The initial work on this paper was stimulated by discussions with A. Ferguson
who proposed that linear programming methods be extended to include the case
of uncertain demands for the problem of optimal allocation of a carrier fleet to
airline routes to meet an anticipated demand distribution. The application of
the theory found in this paper to his problem (discussed later under Example 4)
will be the subject of a separate joint paper. The case of certain demands was
discussed earlier [4].

A complete computation procedure is given for a special class of two-stage
linear programming models in which allocations in the first stage are made to
meet an uncertain but known distribution of demands occurring in the second
stage. This case, applicable to many practical problems, constitutes the principal
part of the paper. Next, a class of models is considered where the activities are
divided into two or more stages. The quantities of activities in the first stage are
the only ones that can be determined in advance because those in the second and
later stages depend on the outcome of random events. Theorems on convexity
of the objective (cost) functions are established for the general m-stage case.

Example 1: Minimum Expected Cost Diet. A nutrition expert wishes to advise
his followers on a minimum cost diet without prior knowledge of the prices [6].
Since prices of food (except for general inflationary trends) are likely to show
variability due to weather conditions, supply, etc., he wishes to assume a
distribution of possible prices rather than a fixed price for each food, and determine
a diet that meets specified nutritional requirements and minimises expected costs.
Let $x_j$ be the quantity of the $j$th food purchased in pounds, $p_j$ its price, $a_{ij}$ the
quantity of the $i$th nutrient (e.g., vitamin A) contained in a unit quantity of the
$j$th food, and $b_i$ the minimum quantity required by an individual for good health.
Then the $x_j$ must be chosen so that

(1)  $\sum_{j=1}^{n} a_{ij} x_j \ge b_i, \quad x_j \ge 0 \qquad (i = 1, 2, \ldots, m)$

and the cost of the diet will be

(2)  $C = \sum_{j=1}^{n} p_j x_j.$

The $x_j$ are chosen before the prices are known so that the expected costs of such
a diet are clearly

(3)  $\text{Exp } C = \sum_{j} \bar{p}_j x_j$

where $\bar{p}_j$ is the expected price of food $j$. Since the $\bar{p}_j$ are known in advance, the best choices
of $x_j$ are those which satisfy (1) and minimize (3). Hence in this case expected prices
may be used in place of the distribution of prices and the usual linear
programming problem solved. 1

Example 2: Shipping to an Outlet to Meet an Uncertain Demand.

Let us consider a simple two-stage case: A factory has 100 items on hand which
may be shipped to an outlet at the cost of $1 apiece to meet an uncertain demand
$d_2$. In the event that the demand should exceed the supply, it is necessary to
meet the unsatisfied demand by purchases on the local market at $2 apiece. The
equations that the system must satisfy are

(4)  $100 = x_{11} + x_{12}$
     $d_2 = x_{11} + x_{21} - x_{22} \qquad (x_{ij} \ge 0)$
     $C = x_{11} + 2x_{21}$

where $x_{11}$ = number shipped from the factory, $x_{12}$ = number stored at factory;
$x_{21}$ = number purchased on open market, $x_{22}$ = excess of supply over
demand;

$d_2$ = unknown demand uniformly distributed between 70 and 80;

$C$ = total costs.

It is clear that whatever be the amount shipped and whatever be the demand
$d_2$, it is possible to choose $x_{21}$ and $x_{22}$ consistent with the second equation. The
unused stocks $x_{12} + x_{22}$ are assumed to have no value or are written off at some
reduced value (like last year’s model automobiles when the new production comes
in). To illustrate some of the concepts of this paper, a solution will be presented
later.

1 In some applications, however, it may not be desirable to minimize the expected value
of the costs if the decision has too great a variation in the actual total costs. H. Markowitz
[5] in his analysis of investment portfolios develops a technique for computing for each
possible expected value the minimum variance. This enables the investor to sacrifice some of
his expectation to control his risks.
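
For readers who wish to experiment with Example 2, the expected cost as a function of the amount shipped can be evaluated directly. The sketch below is not the computational procedure of this paper; it is a plain grid search under the stated data (demand uniform between 70 and 80, shipping at $1, local purchase at $2), using the fact that the cheapest second-stage choice is to buy exactly the shortage, $x_{21} = \max(0, d_2 - x_{11})$. The function name and the grid are assumptions.

```python
def expected_cost(x11, lo=70.0, hi=80.0, ship_cost=1.0, local_cost=2.0):
    """Expected total cost of shipping x11 units when demand is uniform on [lo, hi]."""
    # Expected shortage E[max(0, d - x11)] for d uniform on [lo, hi].
    if x11 <= lo:
        shortage = (lo + hi) / 2.0 - x11
    elif x11 >= hi:
        shortage = 0.0
    else:
        shortage = (hi - x11) ** 2 / (2.0 * (hi - lo))
    return ship_cost * x11 + local_cost * shortage

candidates = [i / 10.0 for i in range(0, 1001)]    # x11 = 0.0, 0.1, ..., 100.0
best_x11 = min(candidates, key=expected_cost)
print(best_x11, expected_cost(best_x11))           # prints 75.0 77.5
```

Under these assumptions the expected cost is smallest when 75 items are shipped, with an expected cost of 77.5.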

Example 3: A Three-Stage Case.

For this purpose it is easy to construct an extension of the previous example
by allowing the surpluses $x_{12}$ and $x_{22}$ to be carried over to a third stage, i.e.,

(5)  1st stage:  $100 = x_{11} + x_{12}$
     2nd stage:  $70 + x_{12} = x_{23} + x_{24}$
                 $d_2 = x_{11} + x_{21} - x_{22}$
     3rd stage:  $d_3 = x_{22} + x_{23} + x_{31} - x_{32}$
                 $C = x_{11} + 2x_{21} + x_{23} + 2x_{31}$

where $x_{23}$ = number shipped from factory in 2nd stage, $x_{24}$ = number stored at
factory in 2nd stage;

70 = number produced 2nd stage;

$d_3$ = unknown demand in 3rd stage uniformly distributed between 70 and 80;

$x_{31}$ = number purchased on the open market in 3rd stage, $x_{32}$ = excess
of supply over demand in 3rd stage. 2

It will be noted that the distribution of $d_3$ is independent of $d_2$. However, the
approach which we shall use will apply even if the distribution of $d_3$ depends on
$d_2$. This is important in problems where there may be some postponement of the
timing of demand. For example, it may be anticipated that the potential
refrigerator buyers will buy in November or December. However, those buyers who
failed to purchase in November will affect the demand distribution for December.

Example 4: A Class of Two-Stage Problems.

In the Ferguson problem and in many supply problems the total costs may be
divided into two parts: first the costs of assigning various resources to several
destinations j and second the costs (or lost revenues) incurred because of the
failure of the total amounts $u_1, u_2, \ldots, u_n$ assigned to meet demands at various
destinations in unknown amounts $d_1, d_2, \ldots, d_n$ respectively.

The special class of two-stage programming problems we are considering has
the following structure. For the first stage:

(6)  $\sum_{j=1}^{n} x_{ij} = a_i \qquad (x_{ij} \ge 0)$

(7)  $\sum_{i=1}^{m} b_{ij} x_{ij} = u_j$

where $x_{ij}$ represents the amount of the $i$th resource assigned to the $j$th destination and
$b_{ij}$ represents the number of units of demand at destination j that can be
satisfied by one unit of resource i. For the second stage

(8)  $d_j = u_j + v_j - s_j \qquad (j = 1, 2, \ldots, n)$

where $v_j$ is the shortage 4 of supply and $s_j$ is the excess of supply.

The total cost function is assumed to be of the form

(9)  $C = \sum_{i=1}^{m} \sum_{j=1}^{n} c_{ij} x_{ij} + \sum_{j=1}^{n} \alpha_j v_j$

i.e., depends linearly on the choice $x_{ij}$ and on the shortages $v_j$ (which depend on
assignments $u_j$ and the demands $d_j$).

Our objective will be to minimize total expected costs. 6 Let $\phi_j(u_j \mid d_j)$ be the
minimum costs at a destination if the supply is $u_j$ and the demand is $d_j$. It is clear
that

(10)  $\phi_j(u_j \mid d_j) = \alpha_j (d_j - u_j)$ if $d_j \ge u_j$, and $\phi_j(u_j \mid d_j) = 0$ if $d_j < u_j$,

where $\alpha_j$ is the coefficient of proportionality. We shall now give a result due to
H. Scarf.

2 No solution for this example will be given in this paper. For this case perhaps the
simplest approach is through the techniques of dynamic programming; see R. Bellman [1].

3 The remarks of this section apply if (6) and (7) are replaced more generally by AX = a,
BX = U where X is the vector of activity levels in the first stage, A and B are given
matrices, a a given initial status vector, and $U = (u_1, u_2, \ldots, u_n)$.

Theorem: The expected value of $\phi_j(u_j \mid d_j)$, denoted by $\phi_j(u_j)$, is a convex function
of $u_j$.

Proof: Let $p(d_j)$ be the probability density of $d_j$; then

(11) $\phi_j(u_j) = \alpha_j \int_{x=u_j}^{+\infty} (x - u_j)\, p(x)\, dx = \alpha_j \int_{x=u_j}^{+\infty} x\, p(x)\, dx - \alpha_j u_j \int_{x=u_j}^{+\infty} p(x)\, dx$

whence, differentiating $\phi_j(u_j)$,

(12) $\phi_j'(u_j) = -\alpha_j \int_{x=u_j}^{+\infty} p(x)\, dx.$

It is clear that $\phi_j'(u_j)$ is a non-decreasing function of $u_j$ with $\phi_j''(u_j) \ge 0$ and that
$\phi_j(u_j)$ is convex. An alternative proof (due also to Scarf) is obtained by applying
a lemma which we shall use later on.
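The differentiation step can be written out with Leibniz's rule. The sketch below assumes that $p$ is continuous at $u_j$, that demand has a finite mean, and that $\alpha_j \ge 0$; these regularity assumptions are added here only so that each expression is well defined.

```latex
\begin{aligned}
\phi_j(u_j)   &= \alpha_j \int_{u_j}^{\infty} x\,p(x)\,dx \;-\; \alpha_j u_j \int_{u_j}^{\infty} p(x)\,dx, \\
\phi_j'(u_j)  &= -\alpha_j u_j\,p(u_j) \;-\; \alpha_j \int_{u_j}^{\infty} p(x)\,dx \;+\; \alpha_j u_j\,p(u_j)
               \;=\; -\alpha_j \int_{u_j}^{\infty} p(x)\,dx, \\
\phi_j''(u_j) &= \alpha_j\,p(u_j) \;\ge\; 0 .
\end{aligned}
```

The two boundary terms cancel, so the slope is minus $\alpha_j$ times the probability that demand exceeds $u_j$; that probability can only fall as $u_j$ rises, which is why the slope is non-decreasing.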

Lemma: If $\phi(x_1, x_2, \ldots, x_n \mid \theta)$ is a convex function over a fixed region $\Omega$ for
every value of $\theta$, then any positive linear combination of such functions is also convex
in $\Omega$.

In particular if $\theta$ is a random variable with probability density $p(\theta)$, then the
expected value of $\phi$

(13) $\Phi(x_1, x_2, \ldots, x_n) = \int_{-\infty}^{+\infty} \phi(x_1, x_2, \ldots, x_n \mid \theta)\, p(\theta)\, d\theta$

is convex. For example from (10), $\phi_j(u_j \mid d_j)$, plotted below, is convex.

4 Equation (8) should be viewed more generally than simply as a statement about the
shortage and excess of supply. In fact, given any $u_j$ and $d_j$, there is an infinite range of
possible values of $v_j$ and $s_j$ satisfying (8). For example, $v_j$ might be interpreted as the amount
obtained from some new source (perhaps at some premium price) and $s_j$ the amount not
used. When the cost form is as in (9), it becomes clear that in order for $C$ to be a minimum
the values of $v_j$ and $s_j$ will have the more restrictive meaning above.

5 H. Markowitz in his analysis of portfolios considers the interrelation of the variance
with the expected value. See [5].



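(14) [Figure: graph of $\phi_j(u_j \mid d_j)$ against $u_j$, falling linearly from $\alpha_j d_j$ at $u_j = 0$ to zero at $u_j = d_j$ and remaining zero thereafter.]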


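From the lemma the result readily follows that $\phi_j(u_j)$ is convex.

From the basic theorem the expected value of the objective function is

(15) $\operatorname{Exp} C = \sum_{i=1}^{m} \sum_{j=1}^{n} c_{ij} x_{ij} + \sum_{j=1}^{n} \phi_j(u_j)$

where the $\phi_j(u_j)$ are convex functions (the factor $\alpha_j$ is already contained in $\phi_j$ by (11)).
Thus the original problem has been reduced to minimizing (15) subject to (6), (7).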

This permits application of a well-known device for approximating such a
problem by a standard linear programming problem in the case the objective
function can be represented by a sum of convex functions. See for example [3]
or Charnes and Lemke, [2]. To do this one approximates the derivative of $\phi(u)$ in
some sufficiently large range $0 \le u \le u_0$ by a step function

(16) [Figure: step function approximating $\phi'(u)$ over $0 \le u \le u_0$.]

involving $k$ steps where the size of the $i$th base is $a_i$ and its height is $h_i$; where $h_1 \le
h_2 \le \ldots \le h_k$ because $\phi$ is convex. An approximation for $\phi(u)$ is given by

(17) $\phi(u) \doteq \phi(0) + \operatorname{Min} \sum_{i=1}^{k} h_i \Delta_i$

subject to

(18) $u = \sum_{i=1}^{k} \Delta_i, \qquad 0 \le \Delta_i \le a_i.$

Indeed, it is fairly obvious that the approximation achieves its minimum by
choosing $\Delta_1 = a_1$, $\Delta_2 = a_2$, $\ldots$ until the cumulative sum of the $\Delta_i$ exceeds $u$ for
some $i = r$; $\Delta_r$ is then chosen as the value of the residual with all remaining
$\Delta_{r+i} = 0$. In other words, we have approximated an integral by the sum of
rectangular areas under the curve up to $u$, i.e.,

(19) $\phi(u) - \phi(0) = \int_0^u \phi'(x)\, dx \doteq \sum_{i=1}^{r-1} h_i a_i + h_r \Delta_r.$

The next step is to replace $\phi(u)$ by $\sum_{i=1}^{k} h_i \Delta_i$ and $u$ by $\sum_{i=1}^{k} \Delta_i$ in the programming
problem and add the restrictions $0 \le \Delta_i \le a_i$. If the objective is minimization of
total costs, it will, of necessity, for whatever value of $u = \sum_{i=1}^{k} \Delta_i$ and $0 \le \Delta_i \le a_i$,
minimize $\sum_{i=1}^{k} h_i \Delta_i$. Thus, this class of two-stage linear programming problems
involving uncertainty can be reduced to a standard linear programming
type problem. In addition, simplifying computational methods exist when variables
have upper bounds such as $\Delta_i \le a_i$; see [3].
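A minimal sketch of the segment-filling argument behind (17) to (19) follows. The function name and the numerical data are illustrative assumptions: the slopes are midpoint values of the derivative of an expected shortage cost with a unit shortage cost of 2 and demand uniform between 70 and 80, chosen only because those figures appear elsewhere in the text.

```python
from fractions import Fraction


def piecewise_linear_min(u, phi0, widths, heights):
    """Evaluate phi(0) + min sum(h_i * delta_i) subject to sum(delta_i) = u
    and 0 <= delta_i <= a_i, by filling the segments in order.

    The in-order fill is optimal only because the heights are non-decreasing,
    which is what convexity of phi guarantees.
    """
    if any(h1 > h2 for h1, h2 in zip(heights, heights[1:])):
        raise ValueError("heights must be non-decreasing (phi convex)")
    if not 0 <= u <= sum(widths):
        raise ValueError("u lies outside the approximated range")
    remaining = u
    total = phi0
    deltas = []
    for a, h in zip(widths, heights):
        d = min(a, remaining)      # take as much of this segment as needed
        deltas.append(d)
        total += h * d
        remaining -= d
    return total, deltas


# Hypothetical data: k = 7 steps covering 0 <= u <= 100.
WIDTHS = [70, 2, 2, 2, 2, 2, 20]
HEIGHTS = [Fraction(-2), Fraction(-9, 5), Fraction(-7, 5), Fraction(-1),
           Fraction(-3, 5), Fraction(-1, 5), Fraction(0)]
PHI0 = Fraction(150)

value_76, deltas_76 = piecewise_linear_min(Fraction(76), PHI0, WIDTHS, HEIGHTS)
value_75, deltas_75 = piecewise_linear_min(Fraction(75), PHI0, WIDTHS, HEIGHTS)
```

With these assumed steps the approximation is exact at a segment boundary such as u = 76 and slightly off inside a segment such as u = 75, which is the usual behaviour of a rectangle rule.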

Example 5: The Two-Stage Problem with General Linear Structure.

We shall prove a general theorem on convexity for the two-stage problem that
forms the inductive step for the multi-stage problem. We shall say a few words
about the significance of this convexity later on. The assumed structure of the
general⁶ two-stage model is

(20)
$b_1 = A_{11} X_1$
$b_2 = A_{21} X_1 + A_{22} X_2$
$C = \phi(X_1, X_2 \mid E_2)$

where $A_{ij}$ are known matrices; $b_1$ is a known vector of initial inventories; $b_2$ is an
unknown vector whose components are determined by a chance mechanism⁷
(mathematically, $E_2$ is a sample point drawn from a multidimensional sample
space with known probability distribution); $X_1$ is the vector of nonnegative
activity levels to be determined in the first stage, while $X_2$ is the vector of
nonnegative activity levels for the second stage. For example:

$a_i = \sum_{j=1}^{n} x_{ij}$
$d_j = \sum_{i=1}^{m} b_{ij} x_{ij} + v_j - s_j$
$C = \sum_{i=1}^{m} \sum_{j=1}^{n} c_{ij} x_{ij} + \sum_{j=1}^{n} \alpha_j v_j$

here $b_1 = (a_1, a_2, \ldots, a_m)$
here $X_1 = (x_{11}, \ldots, x_{1n}, x_{21}, \ldots, x_{2n}, \ldots, x_{mn})$
here $b_2 = (d_1, d_2, \ldots, d_n)$
here $X_2 = (v_1, v_2, \ldots, v_n, s_1, s_2, \ldots, s_n)$

It is assumed that whatever be the choice
of $X_1$ satisfying the first-stage equations and whatever be the particular values
of $b_2$ determined by chance, there exists at least one vector $X_2$ satisfying the
second-stage equations. The total costs $C$ of the program are assumed to depend on
the choice of $X_1$, $X_2$, and parametrically on $E_2$. The basic problem is to choose
$X_1$ and later $X_2$ in the second stage such that the expected value of $C$ is a minimum.

6 A special case of the general model given in (20) is found in Example 4.

7 The chance mechanism may be the “market,” the “weather.”

Theorem: If $\phi(X_1, X_2 \mid E_2)$ is a convex function in $X_1$, $X_2$ whatever be $X_1$ in
$\Omega_1$, i.e., satisfying the 1st stage restrictions, and whatever be $X_2$ in $\Omega_2 = \Omega_2(X_1 \mid b_2)$,
i.e., satisfying the 2nd stage restrictions given $b_2$ and $X_1$, then there exists a convex
function $\phi_0(X_1)$ such that the optimal choice of $X_1$ subject to $b_1 = A_{11} X_1$ is found by
minimizing $\phi_0(X_1)$ where

(21)
$\phi_0(X_1) = \operatorname*{Exp}_{E_2} \Big[ \operatorname*{Inf}_{X_2 \in \Omega_2} \phi(X_1, X_2 \mid E_2) \Big],$
$\operatorname{Exp} C = \operatorname*{Inf}_{X_1 \in \Omega_1} \phi_0(X_1);$

the expectation (Exp) is taken with respect to the distribution of $E_2$ and the
greatest lower bound (Inf)⁸ is taken with respect to all $X_2 \in \Omega_2(X_1 \mid E_2)$.

Proof:⁹ In order to minimize $\operatorname{Exp} \phi(X_1, X_2 \mid E_2)$, it is clear that once $X_1$
has been selected and $E_2$ determined by chance, $X_2$ must be selected so that
$\phi(X_1, X_2 \mid E_2)$ is minimized for fixed $X_1$ and $E_2$. Thus, the cost for given $X_1$
and $E_2$ is given by

(22) $\phi_1(X_1 \mid E_2) = \operatorname*{Inf}_{X_2 \in \Omega_2} \phi(X_1, X_2 \mid E_2).$

The expected cost for a given $X_1$ is then simply the expected value of $\phi_1(X_1 \mid E_2)$
and this we denote by $\phi_0(X_1)$. The optimal choice of $X_1$ to minimize expected
costs $C$ is thus reduced to choosing $X_1$ so as to minimize $\phi_0(X_1)$. There remains
only to establish the convexity property. We shall show first that $\phi_1(X_1 \mid E_2)$ for
bounded $\phi_1$ is convex for $X_1$ in $\Omega_1$. If true, then applying the lemma, the result
that $\phi_0(X_1)$ is convex readily follows. Let us suppose that $\phi_1(X_1 \mid E_2)$ is not convex;
then there exist three points in $\Omega_1$: $X_1'$, $X_1''$, $X_1''' = \lambda X_1' + \mu X_1''$ ($\lambda + \mu =
1$, $0 \le \lambda \le 1$) that violate the condition for convexity, i.e.,

(23) $\lambda \phi_1(X_1' \mid E_2) + \mu \phi_1(X_1'' \mid E_2) < \phi_1(X_1''' \mid E_2)$

or

(24) $\lambda \phi_1(X_1' \mid E_2) + \mu \phi_1(X_1'' \mid E_2) = \phi_1(X_1''' \mid E_2) - \epsilon_0, \qquad \epsilon_0 > 0.$

For any $\epsilon_0 > 0$, however, there exist $X_2'$ and $X_2''$ such that

(25)
$\phi_1(X_1' \mid E_2) = \phi(X_1', X_2' \mid E_2) - \epsilon_1, \qquad 0 \le \epsilon_1 < \epsilon_0$
$\phi_1(X_1'' \mid E_2) = \phi(X_1'', X_2'' \mid E_2) - \epsilon_2, \qquad 0 \le \epsilon_2 < \epsilon_0.$

Setting $X_2''' = \lambda X_2' + \mu X_2''$ we note because of the assumed linearity of the
model (20) that $(\lambda X_2' + \mu X_2'') \in \Omega_2(\lambda X_1' + \mu X_1'' \mid E_2)$ and hence by convexity
of $\phi$

(26) $\lambda \phi(X_1', X_2' \mid E_2) + \mu \phi(X_1'', X_2'' \mid E_2) \ge \phi(X_1''', X_2''' \mid E_2)$

whence by (25)

(27) $\lambda \phi_1(X_1' \mid E_2) + \mu \phi_1(X_1'' \mid E_2) \ge \phi(X_1''', X_2''' \mid E_2) - \lambda \epsilon_1 - \mu \epsilon_2$

and by (24)

(28) $\phi_1(X_1''' \mid E_2) \ge \phi(X_1''', X_2''' \mid E_2) - \lambda \epsilon_1 - \mu \epsilon_2 + \epsilon_0 \qquad (0 \le \lambda \epsilon_1 + \mu \epsilon_2 < \epsilon_0)$

which contradicts the assumption that $\phi_1(X_1''' \mid E_2) = \operatorname{Inf} \phi(X_1''', X_2 \mid E_2)$. The
proof for unbounded $\phi$ is omitted.

8 The greatest lower bound instead of minimum is used to avoid the possibility that the
minimum value is not attained for any admissible point $X_2 \in \Omega_2$ or $X_1 \in \Omega_1$. In case where
the latter occurs, it should be understood that while there exists no $X_i$ where the minimum
is attained, there exists $X_i$ for which values as close to minimum as desired are attained.

9 This proof is along lines suggested by I. Glicksberg.

Example 6: The Multi-Stage Problem with General Linear Structure.

The structure assumed is

(29)
$b_1 = A_{11} X_1$
$b_2 = A_{21} X_1 + A_{22} X_2$
$b_3 = A_{31} X_1 + A_{32} X_2 + A_{33} X_3$
$b_4 = A_{41} X_1 + A_{42} X_2 + A_{43} X_3 + A_{44} X_4$
$\cdots$
$b_m = A_{m1} X_1 + A_{m2} X_2 + A_{m3} X_3 + \cdots + A_{mm} X_m$
$C = \phi(X_1, X_2, \ldots, X_m \mid E_2, E_3, \ldots, E_m)$

where $b_1$ is a known vector; $b_i$ is a chance vector ($i = 2, \ldots, m$) whose components
are functions of a point $E_i$ drawn from a known multi-dimensional distribution;
$A_{ij}$ are known matrices. The sequence of decisions is as follows: $X_1$, the
vector of nonnegative activity levels in the 1st stage, is chosen so as to satisfy
the first stage restrictions $b_1 = A_{11} X_1$; the values of the components of $b_2$ are chosen
by chance by determining $E_2$; $X_2$ is chosen to satisfy the 2nd stage restrictions
$b_2 = A_{21} X_1 + A_{22} X_2$, etc. iteratively for the third and higher stages. It is further
assumed that:

(1) The components of $X_j$ are nonnegative;

(2) There exists at least one $X_j$ satisfying the $j$th stage restraints, whatever be
the choice of $X_1, X_2, \ldots, X_{j-1}$ satisfying the earlier restraints or the
outcomes $b_1, b_2, \ldots, b_m$.

(3) The total cost $C$ is a convex function in $X_1, \ldots, X_m$ which depends on
the values of the sample points $E_2, E_3, \ldots, E_m$.

Theorem: An equivalent $(m - 1)$ stage programming problem with a convex pay-off
function can be obtained by dropping the $m$th stage restrictions and replacing the
convex cost function $\phi$ by

(30) $\phi_{m-1}(X_1, X_2, \ldots, X_{m-1} \mid E_2, \ldots, E_{m-1}) = \operatorname*{Exp}_{E_m} \; \operatorname*{Inf}_{X_m \in \Omega_m} \phi(X_1, X_2, \ldots, X_m \mid E_2, \ldots, E_m)$

where $\Omega_m$ is the set of possible $X_m$ that satisfy the $m$th stage restrictions.

Since the proof of the above theorem is identical to the two-stage case no details
will be given. The fact that a cost function for the $(m - 1)$st stage can be
obtained from the $m$th stage is simply a consequence of the fact that optimal behavior for
the $m$th stage is well defined, i.e., given any state, e.g., $(X_1, X_2, \ldots, X_{m-1})$, at
the beginning of this stage, the best possible actions can be determined and the
minimum expected cost evaluated. This is a standard technique in “dynamic
programming.” The reader interested in methods built around this approach
is referred to R. Bellman’s book on dynamic programming [1].

While the existence of convex functions has been demonstrated that permit
reduction of an $m$-stage problem to equivalent $(m-1)$-, $(m-2)$-, $\ldots$, 1-stage problems, it
appears hopeless that such functions can be computed except in very simple
cases. The convexity theorem was demonstrated not as a solution to an $m$-stage
problem but only in the hope that it will aid in the development of an efficient
computational theory for such models. It should be remembered that any procedure
that yields a local optimum will yield a true optimum if the function is
convex. This is important because multi-dimensional problems in which non-convex
functions are defined over non-convex domains lead as a rule to local optima
and to an almost hopeless task, computationally, of exploring other parts of the
domain for the other extremes.

Solution for Example 2: Shipping to an Outlet to Meet an Uncertain Demand.

Let us consider the two-stage case given earlier (4). It is clear that, if supply
exceeds demand ($x_{11} > d_2$), $x_{21} = 0$ gives minimum costs and, if $x_{11} \le d_2$,
$x_{21} = d_2 - x_{11}$ gives minimum costs. Hence

(31) $\operatorname*{Min}_{x_{21}} \phi = \begin{cases} x_{11} & \text{if } x_{11} > d_2 \\ x_{11} + 2(d_2 - x_{11}) & \text{if } x_{11} \le d_2. \end{cases}$

Since $d_2$ is assumed to be uniformly distributed between 70 and 80,

(32) $\operatorname{Exp} [\operatorname{Min} \phi] = \begin{cases} -x_{11} + 150 & \text{if } x_{11} \le 70 \\ 77.5 + \tfrac{1}{10}(75 - x_{11})^2 & \text{if } 70 < x_{11} \le 80 \\ x_{11} & \text{if } 80 \le x_{11}. \end{cases}$

This function is clearly convex and attains its minimum 77.5, which is the
expected cost, at $x_{11} = 75$. Since $x_{11} = 75$ is in the range of possible values of $x_{11}$ as
determined by $100 = x_{11} + x_{12}$, this is clearly the optimal shipment. In this case
it pays to ship $x_{11} = \bar{d}_2 = 75$, the expected demand.
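The closed form (32) can be checked numerically. The sketch below is an illustration rather than part of the original solution: it encodes the three branches of (32), compares them with a direct average of the recourse cost (31) over uniformly distributed demand, and searches the integer shipments from 0 to 100. The function names and the midpoint averaging are assumptions made for the example.

```python
from fractions import Fraction


def expected_cost(x11):
    """Closed form (32): unit shipping cost 1, unit purchase cost 2,
    demand uniformly distributed between 70 and 80."""
    x = Fraction(x11)
    if x <= 70:
        return 150 - x
    if x <= 80:
        return Fraction(155, 2) + (75 - x) ** 2 / 10
    return x


def expected_cost_direct(x11, steps=100):
    """Average of the recourse cost (31) over demand by the midpoint rule.

    Exact only when the kink at d = x11 falls on a cell boundary, which is
    the case for the integer shipments used here.
    """
    x = Fraction(x11)
    total = Fraction(0)
    for s in range(steps):
        d = 70 + Fraction(10 * (2 * s + 1), 2 * steps)   # cell midpoint
        total += x + 2 * max(Fraction(0), d - x)
    return total / steps


best_shipment = min(range(0, 101), key=expected_cost)
best_cost = expected_cost(best_shipment)
```

The search returns a shipment of 75 at an expected cost of 77.5, in agreement with the text.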

It can be shown by simple examples that one cannot replace, in general, the
chance vectors $b_i$ by $\bar{b}_i$, the vector of expected values of the components of $b_i$.
Nevertheless, this procedure, which is quite common, probably provides an
excellent starting solution for any improvement technique that might be devised.
For example, in the problem of Ferguson (application of Example 4), using as a
start the solution based on expected values of demand, it was an easy matter to
improve the solution to an optimal one whose expected costs were 15% less.
Solution for Example 5: The General Two-Stage Case.

When the number of possibilities for the chance vector $b_2$ is $b_2^{(1)}, b_2^{(2)}, \ldots, b_2^{(k)}$
with probabilities $p_1, p_2, \ldots, p_k$ ($\sum p_i = 1$), it is not difficult to obtain a
direct linear programming solution for small $k$, say $k = 3$. Since this type of
structure is very special, it appears likely that techniques can be developed to
handle large $k$. For $k = 3$, the problem is equivalent to determining vectors $X_1$
and vectors $X_2^{(1)}, X_2^{(2)}, X_2^{(3)}$ such that

(33)
$b_1 = A_{11} X_1$
$b_2^{(1)} = A_{21} X_1 + A_{22} X_2^{(1)}$
$b_2^{(2)} = A_{21} X_1 + A_{22} X_2^{(2)}$
$b_2^{(3)} = A_{21} X_1 + A_{22} X_2^{(3)}$
$\operatorname{Exp} C = \gamma_1 X_1 + p_1 \gamma_2 X_2^{(1)} + p_2 \gamma_2 X_2^{(2)} + p_3 \gamma_2 X_2^{(3)} = \operatorname{Min}$

where for simplicity we have assumed a linear objective function.
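The structure of (33), one first-stage vector shared by all scenarios and a separate second-stage vector for each, can be illustrated on the shipping example. The sketch below makes two assumptions that are not in the text: demand is replaced by three equally likely values (70, 75, 80), and the small problem is solved by enumerating integer shipments instead of by the simplex method.

```python
from fractions import Fraction

# Hypothetical three-point demand distribution: (probability p_s, demand d_s).
SCENARIOS = [(Fraction(1, 3), 70), (Fraction(1, 3), 75), (Fraction(1, 3), 80)]
SUPPLY = 100                 # units on hand at the factory
SHIP_COST, BUY_COST = 1, 2   # unit costs of shipping and of local purchase


def expected_cost(x11):
    """Exp C for a given first-stage shipment x11.

    Each scenario gets its own second-stage purchase x21, chosen as the
    cheapest amount that still meets that scenario's demand.
    """
    total = Fraction(SHIP_COST * x11)
    for p, d in SCENARIOS:
        x21 = max(0, d - x11)
        total += p * BUY_COST * x21
    return total


best_x11 = min(range(SUPPLY + 1), key=expected_cost)
best_cost = expected_cost(best_x11)
```

Under these assumed scenarios the best shipment is again 75; the expected cost differs from the continuous case because the demand distribution has been coarsened.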
LINEAR PROGRAMMING AND SEQUENTIAL DECISIONS*†


ALAN S. MANNE** 


Cowles Foundation , Yale University 

Using an illustration drawn from the area of inventory control, this paper 
demonstrates how a typical sequential probabilistic model may be formulated 
in terms of (a) an initial decision rule and (b) a Markov process, and then 
optimized by means of linear programming. This linear programming tech¬ 
nique may turn out to be an efficient alternative to the functional equation 
approach in the numerical analysis of such problems. Regardless of compu¬ 
tational significance, however, it is of interest that there should be such a 
close relationship between the two traditionally distinct areas of dynamic 
programming and linear programming. 


1. Summary 

Using an illustration drawn from the area of inventory control, this paper 
demonstrates how a typical sequential probabilistic model may be formulated 
in terms of (a) a decision rule, specifying order quantities as a function of initial 
stock levels, and (b) a Markov process in which the transition probabilities 
depend both upon the decision rule and also upon the probability distribution 
of demands. Optimization of the decision rule is accomplished by means of 
linear programming.

In contrast with the linear programming studies of Dantzig [4] and Radner 
[10], the time horizon considered here is infinite rather than finite. For a study 
very closely related to this one, the reader is referred to a paper written by R.
Howard [7].

The essential idea underlying this linear programming formulation is that
the “state” variable $i$ (initial stock level) and the “decision” variable $j$ (order
quantity) are introduced as subscripts to the unknowns $x_{ij}$. These unknowns
$x_{ij}$ represent the joint probabilities with which the state variable takes on the
value of $i$ and the decision variable the value of $j$. With an infinite time horizon,
it is then possible to derive equilibrium distributions (steady state probabilities)
of inventory levels, production quantities, and shortage levels. The requirements
of statistical equilibrium furnish the linear restraints, and the objective
function to be minimized consists of the expected cost level corresponding to
the equilibrium probabilities.


Although the particular application described is a rather specialized one,
there seem to be quite a number of dynamic programming problems in which
this computational technique may prove to be an efficient alternative to the
usual iterative method for solving functional equations. As yet, there is only
a limited amount of evidence available for comparing the effectiveness of the
two approaches from the viewpoint of numerical analysis. Regardless of
computational significance, however, it is of interest that there should be such a
close relationship between the two traditionally distinct areas of dynamic
programming and linear programming.

2. Formulation of the Problem 

This is a single-item inventory problem in which the initial stock on hand at 
the beginning of each “month” is, in Bellman’s terminology, the “state vari¬ 
able.” [2, p. 81] The size of initial inventory will be indicated by the subscript i. 
The quantity produced within the month is the “decision variable,” and the 
amount produced will be indicated by the subscript j. Our problem is to obtain 
an optimal sequential decision rule—that is, to specify a value of j for each value 
taken on by i. 

The sum of initial inventory plus the quantity produced will be known as 
the “available stock,” and its size will be denoted by k. 

The quantity demanded during the month is a serially independent random
variable, $n$. The symbol $p_n$ represents the probability with which $n$ units will be
demanded.

The size of month-end terminal inventories will be indicated by $t$. If backlogs
of demand are to be ruled out, $t = \max(0, k - n)$.

Once a decision rule and a demand probability distribution have been
specified, the inventory process may be regarded as a Markov chain. From this
chain may be calculated the equilibrium probability distribution of inventory 
levels, of production quantities, and of shortage levels. It will be assumed that 
the decision rule is to be specified in such a way as to minimize the expected 
monthly costs corresponding to these equilibrium probabilities. (Note that this 
objective is closely related to, but by no means identical with that of minimiz¬ 
ing discounted expected costs.) 

The relevant costs here consist of the sum of the expected values of three
components: (1) those costs related to the initial inventory levels $i$, (2) those
related to the production quantities $j$, and (3) those related to the shortage levels
$(n - k)$. Symbolically, total costs are expressed as follows:

(1) $\mathcal{E}C_1(i) + \mathcal{E}C_2(j) + \mathcal{E}C_3(n - k)$

No convexity restrictions are imposed upon any of the three functions $C_1(i)$,
$C_2(j)$, and $C_3(n - k)$.¹ Convexity is, in effect, brought about by supposing
that mixed strategies are available. In other words, the conditional probability
of taking action $j$ (given that the initial inventory is at level $i$) may lie anywhere
in the closed interval between zero and unity.

1 It is a serious limitation of the Holt-Modigliani-Simon production smoothing model 
that all cost functions must be quadratic. [6] No such assumption is required in the case 
discussed here. 

2 In an accompanying note by Harvey Wagner [12], it is shown that even though prob¬ 
ability mixtures are permissible, there will always be an optimal solution consisting solely 
of “pure” strategies. 
Some fairly light restrictions are imposed upon the quantities $i$, $j$, $k$, $n$, and $t$.
First, they must be non-negative integers. Second, there must exist a positive
integer $T$, an upper limit upon inventory accumulation, such that:

$t = \max(0, k - n) \le T$

The linear programming problem described below will involve $T + 1$ equations.
In order for the simplex computations to be carried out with present-day
electronic machine programs, it would be necessary to choose units in
such a way that the integer $T$ does not exceed something of the order of 200.

3. Some Definitions

DF: $y_i$ = probability that a month’s initial stock equals $i$. ($\sum_i y_i = 1$.)

DF: $y'_t$ = probability that a month’s terminal stock equals $t$. ($\sum_t y'_t = 1$.)

Statistical equilibrium requires:

(2) $y_t = y'_t \qquad (t = 0, 1, \ldots, T)$

DF: $x_{ij}$ = joint probability with which the initial stock equals $i$ and the
production quantity equals $j$.

(3) $\sum_j x_{ij} = y_i \qquad (i = 0, 1, \ldots, T)$

and

(4) $\sum_{i,j} x_{ij} = 1$

DF: $z_k$ = probability that the available stock equals $k$.

(5) $z_k = \sum_{i+j=k} x_{ij} \qquad (k = 0, 1, \ldots, T)$

DF: $p_n$ = probability that $n$ units are demanded within the month.

N.B. The probabilities $p_n$ are independent of any choices made by the
decision-maker. The probabilities $x_{ij}$, $y_i$, $y'_t$, and $z_k$, however, are
directly under his control. (Note that once the joint probabilities
$x_{ij}$ have been specified, it is straightforward to reconstruct the
decision rule, i.e., the conditional probability of taking action $j$, given
the initial stock level $i$.)

4. Relationships between the Individual Probabilities

Since the random variable $n$ is independent of the available stock $k$, and
since $t = \max(0, k - n)$:

(6)
$y'_0 = \sum_{k,n:\; k-n \le 0} p_n z_k$
$y'_t = \sum_{k,n:\; k-n = t} p_n z_k \qquad (t = 1, 2, \ldots, T)$

By (5):

(7)
$y'_0 = \sum_{i,j,n:\; i+j-n \le 0} p_n x_{ij}$
$y'_t = \sum_{i,j,n:\; i+j-n = t} p_n x_{ij} \qquad (t = 1, 2, \ldots, T)$

By (2) and (3), we finally arrive at the interdependence relationships
between the individual unknowns $x_{ij}$:

(8.0) $\sum_j x_{0j} = \sum_{i+j-n \le 0} p_n x_{ij}$

(8.t) $\sum_j x_{tj} = \sum_{i+j-n = t} p_n x_{ij} \qquad (t = 1, 2, \ldots, T)$


Equations (8.0) - (8.T) may each be interpreted as a requirement of statis¬ 
tical equilibrium. In each of these equations, the left-hand side measures the 
probability with which the initial monthly inventory level will be t, and the 
right-hand side the probability with which the terminal level will equal t. Statis¬ 
tical equilibrium implies that these two probabilities must coincide. 

The unknowns in the linear programming model are the joint probabilities 
Xij . The constraints consist of the usual non-negativity conditions upon the 
Xij , together with equations (4) and (8.1)-(8.T). Equation (8.0) is redundant, 
and need not be included explicitly within the constraint set. 


5. Expected Costs

The cost coefficient associated with each of the $x_{ij}$ will be known as $c_{ij}$.
The total cost expression to be minimized by means of the simplex procedure is
as follows:

(9) $\sum_{i,j} c_{ij} x_{ij}$

How do we assign values to the coefficients $c_{ij}$ so as to be consistent with the
minimand given previously by expression (1)? Note that:

$\mathcal{E}C_1(i) = \sum_i y_i C_1(i) = \sum_{i,j} x_{ij} C_1(i)$

$\mathcal{E}C_2(j) = \sum_{i,j} x_{ij} C_2(j)$

$\mathcal{E}C_3(n - k) = \sum_{i,j} x_{ij} \sum_n p_n C_3(n - i - j)$

The cost coefficient $c_{ij}$ associated with the unknown $x_{ij}$ is therefore constructed
as follows:

(10) $c_{ij} = C_1(i) + C_2(j) + \sum_n p_n C_3(n - i - j)$


6. A Numerical Example 

In order to construct a numerical example, it is necessary to assign values 
to the demand probabilities, to the three cost functions, and to the upper limit 
placed upon inventory accumulation. For illustrative purposes, we will work
with the following:

$p_0 = 2/3$      $C_1(i) = i$      $T = 3$
$p_1 = 0$        $C_2(j) = 3j$
$p_2 = 1/3$      $C_3(n - i - j) = \max[0, 6(n - i - j)]$

In addition, it will be assumed that the production capacity is at most one
unit per month (i.e., $j$ = either 0 or 1). Note that the mean demand level
amounts to only 2/3 of this capacity limit. There is, however, a 1/3 probability that
demand will actually amount to twice the production limit.

Table 1 contains a calculation of the cost coefficients for this problem, and
Table 2 indicates the constraint matrix in detached coefficients form. In
transcribing equations (8.1)-(8.3) into this matrix, the right-hand side shown
earlier in the text has been subtracted from the left-hand side. Equation (8.1),
for example, has been transformed as follows:

$\sum_j x_{1j} - \sum_{i+j-n=1} p_n x_{ij} = 0$


TABLE 1

Calculation of the cost coefficients $c_{ij}$

Identification subscripts (i, j)               (0,0)  (0,1)  (1,0)  (1,1)  (2,0)  (2,1)  (3,0)
Inventory costs = C_1(i) = i                     0      0      1      1      2      2      3
Production costs = C_2(j) = 3j                   0      3      0      3      0      3      0
Shortage costs = Σ_n p_n C_3(n - i - j)
    = Σ_n p_n max [0, 6(n - i - j)]              4      2      2      0      0      0      0
Total cost coefficient = c_ij                    4      5      3      4      2      5      3
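The entries of Table 1 can be regenerated from formula (10). The sketch below takes the demand probabilities to be $p_0 = 2/3$, $p_1 = 0$, $p_2 = 1/3$, the values consistent with the shortage-cost row of the table and with the stated mean demand; the function and variable names are illustrative.

```python
from fractions import Fraction

P = {0: Fraction(2, 3), 1: Fraction(0), 2: Fraction(1, 3)}  # demand probabilities p_n
T = 3        # upper limit on inventory
J_MAX = 1    # production capacity per month


def c1(i):
    """Inventory cost C_1(i) = i."""
    return i


def c2(j):
    """Production cost C_2(j) = 3j."""
    return 3 * j


def c3(shortage):
    """Shortage cost C_3(n - i - j) = max[0, 6(n - i - j)]."""
    return max(0, 6 * shortage)


def cost_coefficient(i, j):
    """Formula (10): c_ij = C_1(i) + C_2(j) + sum_n p_n C_3(n - i - j)."""
    return c1(i) + c2(j) + sum(p * c3(n - i - j) for n, p in P.items())


# Admissible (i, j) pairs: available stock i + j may not exceed T.
PAIRS = [(i, j) for i in range(T + 1) for j in range(J_MAX + 1) if i + j <= T]
TABLE_1 = {pair: cost_coefficient(*pair) for pair in PAIRS}
```

The seven admissible pairs come out in the column order of the table, and the bottom row of Table 1 is `[TABLE_1[p] for p in PAIRS]`.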


TABLE 2
Detached coefficients matrix

Identification subscripts (i, j)       (0,0)  (0,1)  (1,0)  (1,1)  (2,0)  (2,1)  (3,0)   Constant terms
Equation (4)                             1      1      1      1      1      1      1        = 1
Equation (8.1)                           0    -2/3    1/3     1      0    -1/3   -1/3       = 0
Equation (8.2)                           0      0      0    -2/3    1/3     1      0        = 0
Equation (8.3)                           0      0      0      0      0    -2/3    1/3       = 0

Optimal activity levels, x_ij            0     1/3     0     2/9    4/9     0      ε*
Conditional probability of ordering
quantity j, given an inventory level
of i: x_ij / Σ_j x_ij                    0      1      0      1      1      0      1*

* ε, a “small” positive quantity.
Also shown in Table 2 is the optimal linear programming solution to the
problem. According to this calculation, the initial inventory will be at a zero
level during 1/3 of the months, at a unit level 2/9 of the time, and at a level of two
during the remaining 4/9.³ The conditional probabilities derived from this solution
indicate the following decision rule: Whenever the initial inventory has
dropped to a level of either zero or unity, one unit of production is ordered.
At higher initial levels, no production takes place at all. Note that no mixed
strategies are indicated despite the fact that this option was built into the model.
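The reported solution can be checked against the equilibrium conditions directly. The sketch below is a verification aid, not the simplex computation itself: it takes $x_{01} = 1/3$, $x_{11} = 2/9$, $x_{20} = 4/9$ (with the small quantity ε set to zero), confirms that the initial and terminal stock distributions coincide, and evaluates the expected monthly cost with the coefficients of Table 1. All names are illustrative.

```python
from fractions import Fraction

P = {0: Fraction(2, 3), 1: Fraction(0), 2: Fraction(1, 3)}   # demand probabilities
T = 3
C = {(0, 0): 4, (0, 1): 5, (1, 0): 3, (1, 1): 4,
     (2, 0): 2, (2, 1): 5, (3, 0): 3}                        # Table 1 coefficients

X = {pair: Fraction(0) for pair in C}
X[(0, 1)] = Fraction(1, 3)
X[(1, 1)] = Fraction(2, 9)
X[(2, 0)] = Fraction(4, 9)

DO_NOTHING = {pair: Fraction(0) for pair in C}
DO_NOTHING[(0, 0)] = Fraction(1)


def initial_prob(t, x):
    """Probability that the month's initial stock equals t (left side of 8.t)."""
    return sum(v for (i, _), v in x.items() if i == t)


def terminal_prob(t, x):
    """Probability that the month's terminal stock equals t (right side of 8.t)."""
    total = Fraction(0)
    for (i, j), v in x.items():
        for n, p in P.items():
            if max(0, i + j - n) == t:
                total += p * v
    return total


def is_equilibrium(x):
    """Check equation (4) and equations (8.0) through (8.T)."""
    return sum(x.values()) == 1 and all(
        initial_prob(t, x) == terminal_prob(t, x) for t in range(T + 1))


def expected_cost(x):
    """Expression (9): sum of c_ij * x_ij."""
    return sum(C[pair] * v for pair, v in x.items())
```

Both `X` and `DO_NOTHING` satisfy the constraints; their costs are 31/9 and 4 respectively, which matches the comparison made in the accompanying footnote.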

7. Some Observations 

(1) There are a number of paths by which one may prove that it will always
be optimal to adopt pure strategies. One way is sketched out in the accompanying
note by Harvey Wagner. Another, and perhaps a more intuitive, way is
to follow the line of reasoning by which Dvoretzky, Kiefer, and Wolfowitz
dismiss mixed strategies in a problem of this sort. This is a problem in which
the demand probabilities $p_n$ are known in advance to the decision-maker, and
do not have to be estimated by him. [5, p. 191 n.] Hence the conclusion that
in a two-person game in which the decision-maker has “found out” his
opponent’s strategy, it will never hurt him to restrict his own choice of strategies to
pure ones.⁴

(2) The choice of an upper limit, $T$, upon inventory accumulation is admittedly
an arbitrary one. Even after finding an optimal solution for a given value of
$T$ and observing that $x_{Tj} = 0$ for all $j$, it is entirely possible that a further
increase in the value of $T$ will lower the minimand still further. It is a simple
matter to construct pathological cost functions that will yield this result. Lest
the reader become too concerned over this potential snare, it is worth pointing
out that there are a number of applications in which there exist very real upper
limits upon the accumulation of inventory, e.g., the reservoir capacity of a
hydroelectric system.

(3) It is not altogether legitimate to have brushed aside the question of 
initial conditions for the Markov process. If the optimal matrix in the linear 
programming solution is a “decomposable” one, the initial conditions will 
clearly govern the ultimate statistical equilibrium. The most direct way to 
circumvent this difficulty would be to assume that the initial conditions lie 
within the control of the decision-maker—at least to the extent that he may 
choose them so as to start off within any one of the subsystems into which the 
larger system splits up. 

(4) It is possible to attach an economic interpretation to the implicit prices
(dual variables) associated with the linear programming solution. They represent
the amount by which total costs would be altered if the initial inventory were
at the $t$th level rather than at zero.⁵ Apparently, they are related to the solution
of Bellman’s functional equation for the inventory problem. [2, pp. 159-164]
This being so, it should be a comparatively simple matter to use them in order
to link together a non-stationary finite-horizon model with a stationary one
having an infinite horizon.

3 The average monthly cost associated with this solution equals (1/3)(5) + (2/9)(4) +
(4/9)(2) = 31/9. It is of some interest to compare this cost level with that of the do-nothing
basic feasible solution, one in which the unknown $x_{00}$ equals unity, all other unknowns
are set at zero, and the resulting monthly costs amount to 4.

4 I am indebted to J. Marschak for having pointed out the applicability of this line of
reasoning to the problem at hand.
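For the numerical example, the implicit prices reported in the paper's footnote (-7/3, -13/3, -11/3) can be reproduced from the relative-value equations of the optimal rule. This is offered as an illustration of the suggested link with the functional-equation approach, not as the paper's own procedure: the equations $h_i + g = c_{ij} + \sum_n p_n h_t$, with $t = \max(0, i + j - n)$ and $h_0 = 0$, are the usual value-determination equations for a fixed decision rule, and the small exact solver is included only to keep the sketch self-contained.

```python
from fractions import Fraction

P = {0: Fraction(2, 3), 1: Fraction(0), 2: Fraction(1, 3)}   # demand probabilities
C = {(0, 1): 5, (1, 1): 4, (2, 0): 2, (3, 0): 3}             # Table 1 entries used by the rule
RULE = {0: 1, 1: 1, 2: 0, 3: 0}                              # production j ordered at stock i


def solve(a, b):
    """Solve a square linear system exactly by Gauss-Jordan elimination."""
    n = len(b)
    m = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        m[col] = [v / m[col][col] for v in m[col]]
        for r in range(n):
            if r != col and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[col])]
    return [row[n] for row in m]


def relative_values():
    """Return [g, h_1, h_2, h_3]: average cost g and values relative to h_0 = 0."""
    rows, rhs = [], []
    for i, j in RULE.items():
        row = [Fraction(1), Fraction(0), Fraction(0), Fraction(0)]  # coefficient of g
        if i > 0:
            row[i] += 1                      # + h_i on the left-hand side
        for n, p in P.items():
            t = max(0, i + j - n)            # terminal stock after demand n
            if t > 0:
                row[t] -= p                  # - p_n * h_t moved to the left
        rows.append(row)
        rhs.append(C[(i, j)])
    return solve(rows, rhs)


g, h1, h2, h3 = relative_values()
```

The average cost g comes out as 31/9, and the relative values coincide with the three implicit prices quoted for equations (8.1) to (8.3).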


8. Areas of Application 


Among the applications that suggest themselves, the following stochastic 
models would seem to be of the most interest: 

(1) Changes in the rate of production. A number of studies have been
concerned with systems in which the costs depend not only upon the rate of
production (as in the example above), but also upon the rate of change of that
level. (E.g., [6].) This kind of problem could be attacked through the same
methods outlined here by defining the “state variable” $i$ as a pair of numbers:
one representing the initial inventory level and the other the rate of production
during the immediately preceding period. With this one change in interpretation,
things would proceed in essentially the same way that has been suggested
here. The only serious difficulty might arise from the computational costs
involved in an increase in the number of equations within the linear programming
model. Instead of just one equation for each of the $(T + 1)$ levels of inventory,
there would now be $r$ equations, one for each of the $r$ discrete rates of production
that were considered. Altogether, the programming matrix would contain
$r \cdot (T + 1)$ rows.

(2) Seasonal storage of inventories. Several recent papers have been focussed
upon the problem of optimization under conditions of seasonally fluctuating
demands (e.g., the demand for heating oil [3]) or of supplies (e.g., the supply
of water for hydroelectric installations [9]). In order for a linear programming
model to reflect such seasonal fluctuations in the probability distribution of
demands or of supplies, the state variable $i$ would again have to represent a
pair of numbers, the first indicating the season of the year and the second the
inventory level at the beginning of the particular season. The conditions of
statistical equilibrium would then imply equality between probabilities for the
terminal inventories of one season and the initial inventories of the one following.
With $s$ seasons and $(T + 1)$ inventory levels in each, a total of $s(T + 1)$ equations
would be involved. Even with time subdivided into 12 individual months
and with 10 levels of inventory considered during each month, the computational
requirements would still remain modest: a 120-equation system.

(3) Multi-location inventory problems. In the event that inventories are
scattered among several geographical locations, it may no longer be appropriate
to describe the system in terms of a single state variable, the aggregate quantity
held in stock. Instead, a separate quantity must be specified for each location.⁶
If, then, there are stocks held at l different locations, the state variable i will
have to be regarded as an l-tuple of numbers. With (T + 1) alternative inventory
levels at each individual location, the linear programming model would
contain no less than (T + 1)^l distinct equations. As far as any realistic problems
are concerned, it must be conceded that this number of equations could become
hopelessly large. Even with just four locations and five inventory levels at each,
the system would contain 5^4 = 625 equations! The most obvious way to reduce the
size of such problems would be to devise some judicious scheme for aggregation
into a manageable number of geographical areas.

solution shown in Table 2, the implicit prices associated with
equations (8.1)-(8.3) are, respectively, -7/3, -13/3, and -11/3. These values serve to

the comparative advantage of combining the Markov process with an inventory
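The equation counts quoted for the seasonal and multi-location variants can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function name and its keyword arguments are hypothetical, and multiplying the three effects together in a single formula is an assumption made for convenience, since the text treats each variant separately.

```python
def equation_count(levels, *, rates=1, seasons=1, locations=1):
    """Number of state equations for the model variants discussed above.

    levels    -- T + 1, the number of inventory levels considered
    rates     -- r, discrete production rates carried in the state
    seasons   -- s, seasons of the year carried in the state
    locations -- l, stocking points, each with its own inventory level
    """
    # One equation per distinct state: the level is repeated per location,
    # then crossed with the season and the production rate.
    return rates * seasons * levels ** locations


# Twelve months with ten inventory levels each.
monthly = equation_count(10, seasons=12)
# Four locations with five inventory levels each.
multi_location = equation_count(5, locations=4)
print(monthly, multi_location)
```

The exponential growth in the number of locations is what makes aggregation into a few geographical areas attractive.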

(4) Delivery lags. Each of the cases described thus far has been based upon 
the assumption that delivery lags are short—that any production ordered at 
the beginning of a period will be available to satisfy whatever demand takes 
place within the period. With long delivery lags, these models hardly seem to 
be appropriate. 

A number of authors [1, 8] have shown, however, that there is a simple way 
to analyze a problem in which there are long but fixed delivery lags—that is, 
no randomness in the time required for delivery. (This formulation guarantees 
that all currently outstanding orders will have been received prior to the arrival 
of any order placed currently.) In addition to non-random delivery lags, these 
authors also assume that a shortage in supply is reflected in a temporary backlog 
rather than in a permanent loss of demand. 

With these assumptions, the appropriate state variable required in order to 
describe the system is no longer the actual inventory on hand, but rather the 
sum of that inventory plus all outstanding orders. To adapt this suggestion to 
the linear programming model discussed here, all that needs to be done is to 
reinterpret the state variable i as “stock on hand plus orders outstanding.” 
The probability p n would be regarded as the probability that n units were 
demanded during whatever time interval is required for the delivery of an order. 
This variant upon the inventory model is equally well adapted to the case in 
which time is regarded as a discrete or as a continuous parameter. 

9. An Unresolved Difficulty 

The minimand employed here represents the average level of costs per unit of 
time, and completely ignores the dating of these costs. Time discounting is
neglected, just as in many other treatments of the inventory problem. The
only justification for this procedure must be that the mean interval between
successive recurrences of any given inventory level is
short relative to the discount factor.

In cases involving equipment analysis, however, this simplification seems 
quite unpalatable. The interval between successive replacements of a piece of 
equipment is likely to be measured in years rather than months [11]. With
such models the “present worth” form of minimand appears essential. It will
be of considerable interest to see whether the current linear programming
formulation of Markov processes can be extended to the case of time discounting.

⁶ Essentially the same problem arises if, instead of one commodity in several locations,
we are concerned with planning for several different commodities at a single location.

CHANCE-CONSTRAINED PROGRAMMING*† 1

A. CHARNES 2 and W. W. COOPER 3 

A new conceptual and analytical vehicle for problems of temporal planning
under uncertainty, involving determination of optimal (sequential) stochastic
decision rules, is defined and illustrated by means of a typical industrial example.
The paper presents a method of attack which splits the problem into
two non-linear (or linear) programming parts: (i) determining optimal probability
distributions, (ii) approximating the optimal distributions as closely
as possible by decision rules of prescribed form.

Introduction 

The problem of stochastic (or better, chance-constrained) programming is
here defined as follows: Select certain random variables as functions of random
variables with known distributions in such a manner as (a) to maximize a functional
of both classes of random variables subject to (b) constraints on these
variables which must be maintained at prescribed levels of probability. More
loosely, the problem is to determine optimal stochastic decision rules under
these circumstances. An example is supplied in [2]. Temporal planning in which
uncertainty elements are present, but in which management has access to “control
variables” with which to influence outcomes, is a general way of characterizing
these problems. Thus, queuing problems in which the availability of
servers, customers, or both are partly controllable fall within this classification.
It should be noted that the constraints to be maintained at the specified
levels of probability will typically be given in the form of inequalities.

The method of attack which will be outlined in this paper consists of splitting
the problem into two parts: (i) determining distributions which maximize the
functional, subject to the probability constraints; (ii) approximating the distributions
so determined as closely as possible (in some sense) by functions of the
known random variables of some prescribed or admissible class. The functions
so determined can be regarded as approximations to the optimal stochastic
decision rule from the admissible class of such rules.

Specifically, for discrete distributions, and piecewise linear functionals along
with linear inequalities involving the random variables (to be maintained at
prescribed levels of probability), we factor the original problem into two new
problems: (1) a problem which determines the coefficients of the step functions
comprising the optimal discrete distributions (or, alternately, the discrete
probability frequencies) and (2) another problem which determines the parameters
of the optimal decision rule in the sense of a “best approximation”.

To fix the ideas, we present first a formulation of a partly controllable situation
which has heretofore been formulated and treated (inadequately) by queuing
models. We consider terminal tankage facilities supplied by a refinery and with
pickup by tankers.


Let

$R_j$ = amount of oil sent to the tankage facilities in the $j$th period.

$S_j^a$ = amount of oil which is picked up from the facilities in the $j$th period
by the $a$th type of tanker.

$I_j$ = inventory on hand at the beginning of the $j$th period.

$T_A$ = tankage (i.e., storage facilities) available.

Thus,

$$I_j = I_0 + \sum_{l=1}^{j-1} R_l - \sum_{l=1}^{j-1} \sum_a S_l^a \tag{1}$$

relates the inventory $I_j$ to the initial inventory, $I_0$, and rates of input and
withdrawal.

The objective is to minimize the expected total cost of input and withdrawal,
including such features as the cost of changing input rates, demurrage, charter
and dispatch. Formally this may be stated as

$$\text{minimize } E[C(R, S)] \tag{2}$$

Here the vectors $R$ and $S$ indicate that cost is to be considered over all periods.
The expectation sign, $E$, indicates that these vector variables are to be considered
stochastically in assessing the expected value of the total costs $C$.

More explicitly, the cost may be written as

$$C(R, S) = \sum_j C_j(R_j) + \sum_j w_j\,|R_j - R_{j-1}| + \sum_j d_j A_j^- \tag{2a}$$

where

$$I_j + R_j - \sum_a S_j^a = A_j^+ - A_j^-, \qquad A_j^+,\ A_j^- \ge 0,$$

and

$$A_j^- = \frac{|A_j| - A_j}{2} = \begin{cases} 0 & \text{when } A_j \ge 0 \\ -A_j & \text{when } A_j < 0 \end{cases}
\qquad
A_j^+ = \frac{|A_j| + A_j}{2} = \begin{cases} 0 & \text{when } A_j \le 0 \\ A_j & \text{when } A_j > 0. \end{cases}$$

In this case

$C_j(R_j)$ refers to costs of handling due to the amount $R_j$,

$w_j\,|R_j - R_{j-1}|$ is the component of total costs due to changing the
rate in period $j$,

$\sum_j d_j A_j^-$ are demurrage costs,

and other details may be added as necessary.
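The cost expression (2a) can be evaluated mechanically for any single realisation of the pickups. The following is a minimal sketch, not the authors' procedure: it assumes one aggregated tanker type (so the sum over $a$ is already taken), lets the inventory run negative to represent a shortfall, and introduces the hypothetical parameter `R_prev` for the input rate before the first period. The linear handling cost in the usage lines is likewise only an illustrative choice.

```python
def positive_negative_parts(a):
    """Split a into (A+, A-) with a = A+ - A- and both parts non-negative."""
    return (abs(a) + a) / 2, (abs(a) - a) / 2


def total_cost(R, S, I0, handling, w, d, R_prev=0.0):
    """Evaluate the cost (2a) for one realisation of the pickups.

    R[j], S[j]     -- input and total pickup in period j
    I0             -- initial inventory
    handling(j, r) -- handling cost C_j(r)
    w[j], d[j]     -- rate-change and demurrage cost coefficients
    R_prev         -- input rate before the first period (an assumption)
    """
    cost, inventory, last_rate = 0.0, I0, R_prev
    for j, (r, s) in enumerate(zip(R, S)):
        # A_j = I_j + R_j - S_j; its negative part is the unmet pickup.
        _, shortfall = positive_negative_parts(inventory + r - s)
        cost += handling(j, r) + w[j] * abs(r - last_rate) + d[j] * shortfall
        inventory += r - s      # inventory recursion implied by (1)
        last_rate = r
    return cost


# Two periods, unit handling cost, a shortfall of 2 units in the second period.
example_cost = total_cost(
    R=[10.0, 12.0], S=[8.0, 18.0], I0=2.0,
    handling=lambda j, r: 1.0 * r, w=[0.5, 0.5], d=[3.0, 3.0], R_prev=10.0,
)
print(example_cost)
```

Averaging `total_cost` over many sampled pickup vectors would give a rough estimate of the expectation in (2).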

In this model the Sj are random variables. However the random components 
are stated in terms of deviations from scheduled amounts. In addition, the model 
is of a conditional stochastic variety so that past data, developing experience 
and forecasts of the future all enter into determining the optimum Rj . 

The direct problem is stated in terms of meeting the objective specified in (2) 
subject to the following constraints. 


$$\Pr\Big\{I_j + R_j \ge \sum_a S_j^a\Big\} \ge p_j, \qquad j = 1, 2, \ldots \tag{3a}$$

$$\Pr\{I_j + R_j \le T_A\} \ge \gamma_j, \qquad j = 1, 2, \ldots \tag{3b}$$

where “Pr” means “probability” and $p_j$ and $\gamma_j$ are the prescribed confidence
levels desired for each period $j = 1, 2, \ldots$.
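A constraint of the form (3a) becomes an ordinary inequality once the distribution of the pickup is fixed. The sketch below assumes, purely for illustration, that the total pickup in the period is normally distributed and that the inventory at the start of the period is already known; neither assumption comes from the paper, and the function name is hypothetical.

```python
from statistics import NormalDist


def min_input_for_service(inventory, pickup, p):
    """Smallest non-negative R with Pr{inventory + R >= S} >= p, where S ~ pickup."""
    # Pr{S <= inventory + R} >= p holds exactly when inventory + R reaches
    # the p-quantile of the pickup distribution.
    return max(0.0, pickup.inv_cdf(p) - inventory)


pickup = NormalDist(mu=100.0, sigma=10.0)   # assumed pickup distribution
R = min_input_for_service(inventory=30.0, pickup=pickup, p=0.95)
print(round(R, 2))                          # input required this period
print(round(pickup.cdf(30.0 + R), 4))       # probability level achieved
```

Raising the confidence level `p` raises the required input, which is the trade-off between risk and cost that the long range problem sets out to evaluate.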

The long range problem is concerned with evaluating $T_A$ in terms of the
effects on expected cost $C$. Notice, however, the two significant features associated
with this evaluation in (3b): One, there is a valuation element associated
with varying $T_A$ while the $\gamma_j$ are fixed. Two, there is a valuation element associated
with cost effects on the risks of not meeting schedules or varying $R$ as
different levels of confidence are specified. The former is associated with cost
reductions (or increases) arising from varying $T_A$ (and hence $R$ and $S$ in response
thereto) at given levels of risk. Hence both risk and service may be evaluated
in various combinations when studying the alteration of tankage.

We next illustrate the factoring procedure by means of the following simpler 
example: 

$$\max\; E \sum_{j=1}^{N} \left[ P_j(S_j)\,S_j - (c_j + T_j)\,R_j - K_j I_j \right]$$

subject to

(1)
(i) $\Pr\{I_j + R_j \ge S_j + I_{\min}\} \ge \alpha_j$
(ii) $\Pr\{I_j + R_j - S_j \le I_{\max}\} \ge \beta_j$
(iii) $R_j \ge 0$

where, in period $j = 1, 2, \ldots, N$,

$I_j$ = Inventory on hand at start
$R_j$ = Production rate to be scheduled
$S_j$ = Sales demand
$I_{\min}$ = Minimum inventory to be maintained
$I_{\max}$ = Storage capacity
$K_j$ = Inventory carrying charge
$c_j$, $T_j$ = Production and Transport Cost (per unit), respectively,
$P_j(S_j)$ = Unit sales price as function of sales demand

and $E$ indicates expectation. I.e., the objective is to maximize the expected net
return over an $N$ period planning horizon subject to the probabilistic constraints
which are to be honored in each of the $j = 1, 2, \ldots, N$ periods.

This problem (indeed, a more general one involving a convex functional)
was treated in [2] by means of a restricted class of decision rules which made it
possible to transform the problem into a deterministic one (involving certainty 
equivalents) which could be solved by a specially developed convex programming 
algorithm. The purpose of the present paper is, by contrast, to suggest a new 
analytic method which offers the possibility of handling a much wider class of 
decision rules. 

For this example, as well as for the class of rules which will be considered,
the observable stochastic variables are independent. Because the decision on $R_j$
must be made before $S_j$ is observed, the admissible class of decision rules for $R_j$
(or $A_j = I_j + R_j$) can involve, as random variables, only $S_1, \ldots, S_{j-1}$.
The $A_j$ may thus be considered statistically independent of the respective $S_j$.
This means that the distribution of $A_j - S_j$ is given by a convolution of the
distributions of $A_j$ and $-S_j$.
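For discrete distributions the convolution just described is a double sum over the possible values of $A_j$ and $S_j$. The sketch below is only an illustration of that step, with distributions represented as dictionaries mapping values to probabilities; the function names and the numbers in the usage lines are hypothetical.

```python
from collections import defaultdict


def difference_distribution(pmf_a, pmf_s):
    """Distribution of A - S for independent discrete A and S ({value: prob})."""
    out = defaultdict(float)
    for a, pa in pmf_a.items():
        for s, ps in pmf_s.items():
            # Independence lets the joint probability factor as pa * ps.
            out[a - s] += pa * ps
    return dict(out)


def prob_at_least(pmf, x):
    """Pr{X >= x} for a discrete distribution given as {value: prob}."""
    return sum(p for v, p in pmf.items() if v >= x)


pmf_a = {3: 0.5, 5: 0.5}      # relative frequencies of A_j (to be chosen)
pmf_s = {2: 0.25, 4: 0.75}    # known demand distribution
diff = difference_distribution(pmf_a, pmf_s)
print(diff)
print(prob_at_least(diff, 1))  # probability that A_j - S_j >= I_min = 1
```

Each probability in `diff` is a sum of terms of the form (frequency of $A_j$) × (known demand probability), so a probability statement about $A_j - S_j$ is linear in the unknown frequencies.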

In considering the first of the two parts into which the problem is to be split
we shall transform it into a mixed integer programming problem for determining
the relative frequencies of distributions for the $A_j$. By definition of $I_j$ and $A_j$,

$$R_j = A_j - A_{j-1} + S_{j-1} \quad \text{and} \quad I_j = A_{j-1} - S_{j-1}. \tag{2}$$

Thus, by well-known properties of the expectation operator, $E$, the maximand is
reduced to a linear function of the (yet-to-be determined) relative frequencies
of the $A_j$.

Similarly, the expression

$$\Pr\{A_j - S_j \le x\} = \int f_j(y)\, g_j(x - y)\, dy, \tag{3}$$

where $f_j$ and $g_j$ are the density functions, respectively, for $-S_j$ and $A_j$, is a
linear function of the $\lambda_{jr}$, the (unknown) relative frequency of the $r$th possible
amount for $A_j$. Thus, (1-i) and (1-ii) go over into linear inequalities involving
the $\lambda_{jr}$. To these we must append the conditions


$$\sum_r \lambda_{jr} = 1, \qquad \lambda_{jr} \ge 0 \tag{4}$$

so that the $\lambda_{jr}$ may be interpreted as relative frequencies. Also, since (1-iii)
may be rewritten

$$A_j \ge A_{j-1} - S_{j-1}$$

we may interpret it as

$$\min A_j \ge \max\,(A_{j-1} - S_{j-1}) \tag{5.2}$$

in order to transform the non-negativity requirement to a condition on the
frequency functions.

As is known, the density function for $A_{j-1} - S_{j-1}$ has its relative frequencies
as linear functions of the $\lambda_{j-1,r}$’s; let these be denoted by . The requirement
(5.2) can then be expressed by

hji g i = 1, 2, n 

*—i 

j = 1 , 2 , • • • , N 

( 6 ) 

X>,v ^ h it s = 1, 2, • • • , n 

Tmml 

$0 \le h_{ji} \le 1$

and $h_{ji}$ shall be an integer.

We have thus transformed the first part of the problem into the form: maximize
a linear function of the $\lambda_{jr}$ subject to the linear conditions given by (4)
through (6), plus the requirements that the $h_{ji}$’s shall be integers. General
methods for such mixed-integer problems have been provided by E. M. L. Beale
[1] and R. Gomory [3].

The solution to the first part thus leaves us with a solution to a problem which
is less restricted than the one originally stated. It should also be obvious that
more general piecewise linear functionals can be comprehended via this mode of
attack (with at worst mixed integer requirements) and that more complicated
linear stochastic constraints may be handled where suitable variable transformations
(e.g., analogous to those from the $R_j$, $I_j$ to the $A_j$) permit a translation
into convolutions (hence linear inequality conditions) for the unknown relative
frequencies.

Knowing now the solution to the first part, we next seek to approximate as
closely as possible the distributions for the $A_j$ by means of functions

$$A_j = A_j(S_1, \ldots, S_{j-1}) \tag{7}$$

where the functions are from some specified admissible class; the class of possible
stochastic decision rules any one of which will prescribe the value of $A_j$ given
$S_1, \ldots, S_{j-1}$.⁴ For example, the functions

$$A_j = \alpha_{j0} + \sum_{r=1}^{j-1} \alpha_{jr} S_r \tag{8}$$

comprise the class of linear decision rules.

For this class, i.e., (8), the problem of approximation is probably best carried
out in terms of characteristic functions. E.g., the characteristic function corresponding
to the decision rule for $A_j$, i.e., $D(A_j)$, is given by [4]

$$\phi_{D(A_j)}(t) = \prod_{r=0}^{j-1} \phi_{S_r}(\alpha_{jr} t), \tag{9}$$

a product of the known characteristic functions of the $S_r$ involving the unknown
$\alpha_{jr}$ in their argument. By choice of the $\alpha_{jr}$ we seek to approximate $\phi_{A_j}(t)$, the
characteristic function for the distribution obtained as a solution to the first
problem, in such a manner that the distribution (or density) function corresponding
to $\phi_{D(A_j)}(t)$ has (as closely as possible) the desired characteristics of
the distribution of $A_j$.

⁴ This class may possibly be extended to include still other variables with known distributions
that might improve the fit.
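The product form (9) can be checked numerically for small discrete demands. The sketch below is an illustration only: it treats the constant term $\alpha_{j0}$ as the $r = 0$ factor (equivalent to taking $S_0 \equiv 1$, which is an interpretation rather than something stated in the text), and it compares the product with a brute-force enumeration of the linear rule. All names and numbers are hypothetical.

```python
import cmath


def char_fn(pmf, t):
    """E[exp(i t X)] for a discrete X given as {value: probability}."""
    return sum(p * cmath.exp(1j * t * v) for v, p in pmf.items())


def rule_char_fn(alpha, demand_pmfs, t):
    """Characteristic function of alpha[0] + sum_r alpha[r] * S_r, S_r independent."""
    value = cmath.exp(1j * t * alpha[0])      # the r = 0 factor (S_0 == 1)
    for a, pmf in zip(alpha[1:], demand_pmfs):
        value *= char_fn(pmf, a * t)          # phi_{S_r}(alpha_r * t)
    return value


def rule_distribution(alpha, demand_pmfs):
    """Brute-force distribution of the same linear rule, used as a check."""
    dist = {alpha[0]: 1.0}
    for a, pmf in zip(alpha[1:], demand_pmfs):
        new = {}
        for v, pv in dist.items():
            for s, ps in pmf.items():
                new[v + a * s] = new.get(v + a * s, 0.0) + pv * ps
        dist = new
    return dist


alpha = [2.0, 1.0, 0.5]
demand_pmfs = [{1: 0.5, 3: 0.5}, {0: 0.25, 4: 0.75}]
gap = abs(rule_char_fn(alpha, demand_pmfs, 0.7)
          - char_fn(rule_distribution(alpha, demand_pmfs), 0.7))
print(gap)
```

Because the product is cheap to evaluate for any trial coefficients, it is a convenient objective for the fitting problems considered next.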

Evidently there are many ways of specifying the latter problem. For example,
one may approximate some subset of the cumulants, or semi-invariants; this is
equivalent to approximating the mean and (or) other selected moments (and
associated characteristics) of the distribution of $A_j$ as closely as possible. Another
possibility is to minimize

$$\int_{-\infty}^{\infty} \Big|\, \phi_{A_j}(t) - \prod_{r=0}^{j-1} \phi_{S_r}(\alpha_{jr} t) \Big|^2 dt.$$

By Parseval’s Theorem this integral is equal to

$$2\pi \int_{-\infty}^{\infty} \left[ p_{A_j}(x) - p_{D(A_j)}(x) \right]^2 dx$$

where $p_{A_j}(x)$ is the density function corresponding to $\phi_{A_j}(t)$ and $p_{D(A_j)}(x)$ is
the density function corresponding to the decision rule. Still another possibility
would be to weight the dispersion of $p_{D(A_j)}(x)$ from $p_{A_j}(x)$ by the relative frequencies
of $p_{A_j}(x)$, e.g., to minimize

$$\int_{-\infty}^{\infty} p_{A_j}^2(x) \left[ p_{A_j}(x) - p_{D(A_j)}(x) \right]^2 dx.$$

By Parseval’s Theorem, and the fact that the Fourier transform of the product
of two functions is the convolution of their individual transforms, this is equivalent to

$$\text{minimizing } \int_{-\infty}^{\infty} \Big| \big\{ \phi_{A_j} * (\phi_{A_j} - \phi_{D(A_j)}) \big\}(t) \Big|^2 dt,$$

where the $*$ denotes the convolution operation. All three of these possibilities
are classical non-linear minimization problems since the $\alpha_{jr}$ are completely
unrestricted.

As should now be clear, the problem of stochastic (chance-constrained) programming
involves difficulties of an order incommensurate to that of “certainty”
programming. These difficulties stem fundamentally from the probabilistic constraints
which, as experience (let alone theory) has made clear, are not adequately
represented, as some have done, by applying the expectation operator to the
stochastic form. It is hoped that the conceptual framework and approximation
ideas above will stimulate additional research on models and methods of this
character which are essential to insight into and progress on management problems
of a temporal nature involving conditional decisions.

ON THE OPTIMAL INVENTORY EQUATION

R. BELLMAN, I. GLICKSBERG, and O. GROSS
The Rand Corporation 

1. Summary 

The purpose of this paper is to discuss a number of functional equations 
which arise in the “optimal inventory” problem. This is a particular case of the 
general problem of ordering in the face of an uncertain future demand. 

Actually, an important aspect of the problem is that of determining a suitable 
criterion of cost, one which is both realistic and analytically malleable. 

In the following sections we shall consider various sets of assumptions which 
yield various functional equations, all belonging to a common family. 

2.1 Finite Total Time Period 

The first process we shall consider is one involving the stocking of only one 
item, where we may order at each of a finite number of equally-spaced times and 
we must fulfill the demand at these same times or pay a penalty. We shall fur¬ 
ther assume that there is no delay in filling an order or a demand. 

It is important to emphasize that we have made the, in many cases, unrealistic 
assumption that the distribution of demand is the same at each stage of the 
process. Fortunately, as is easily verified following the argumentation below, 
although this changes the parameters describing the optimal policy, it does not 
affect the basic structure of the optimal policy, namely constant stock level. 

Let us suppose that we know completely the following functions. Again in 
realistic situations, the determination of these functions may constitute a major 
difficulty. 

For the case where the cost functions are taken proportional to the quantity 
ordered we obtain complete solutions for the case of an arbitrary number of items 
and arbitrary distribution of demand. 

2. Introduction 

In this paper we wish to consider a number of interesting analytic problems 
arising in the study of inventory and stock control. The origin of this work is a 
paper by Arrow, Harris, and Marschak, [1], which contains the first mathematical 
formulation of problems of this genre. Following this are the detailed papers of 
Dvoretzky, Kiefer, and Wolfowitz, [4], [5], containing existence and uniqueness 
theorems for the class of functional equations which arise, and a discussion of 
statistical estimation problems connected with the determination of demand 
functions. Up to the present, relatively few complete solutions of general classes 
of these problems have been given and little seems to have been published on the 
more realistic problems involving stockpiles of many different items with correlated
demand functions.

In the pages that follow we shall obtain complete solutions, in the sense that
the structure of the optimal policy will be completely determined, for some sig¬ 
nificant cases where simple, but quite realistic and useful, assumptions are made. 

The usefulness of these explicit solutions is great. Apart from the fact that they 
may be used to obtain approximate solutions to problems of more intricate type 
these exact solutions frequently lay bare the combinations of essential parameters 
which are most meaningful. 

Quite often, the analytic representation of the solution possesses a quite elegant 
and simple economic interpretation which, when verbalized, permits one to ob¬ 
tain a good approximation to the optimal policy in more complicated processes, 
cf. (2), for further discussion of this point. 

It is, furthermore, often true that the determination of an optimal policy may 
depend upon far less than would seem to be the case at first sight. Thus, in one 
class of problems in which we wish to minimize expected cost, it turns out that
we need only know expected outcomes, cf. (2). Another example of this phe¬ 
nomenon occurs below where only the cumulative distribution of demand plays a 
role in determining optimal policy. 

Apart from the results we obtain, the methods we employ possess an inde¬ 
pendent interest. They have already been employed in connection with other 
classes of functional equations in the theory of dynamic programming, cf. (2),
and appear quite useful in applications in other fields such as the calculus of 
variations. What stands out quite vividly is that the method of successive ap¬ 
proximations is not only useful in the production of existence and uniqueness 
theorems, to which dull task it is usually relegated, but is a powerful analytic 
tool for the discovery and proof of properties of the solution of a functional
equation.

The paper is divided into three parts. The first part contains a discussion of 
the characteristic features of the problem we shall discuss and the assumptions 
we shall make, together with a derivation of the functional equations which 
arise from various combinations of features and assumptions. 

In §3 we consider the problem of existence and uniqueness of solutions of these 
equations, and the convergence of successive approximations. In particular, we 
show that we can always obtain monotone convergence by choosing an initial
approximation in “policy space”, cf. (2).

Although existence and uniqueness have been treated by DKW, (4), we feel 
that it is worthwhile to present another proof here since the convergence of the 
successive approximations and the uniqueness of the solution play paramount 
roles in our further discussion. Our proof is distinct from theirs. 

In §4 we present the simple observation which guides all our subsequent
analysis.

Part II is devoted to a discussion of a number of models in which the optimal 
policy is characterized by the principle of constant stock level. In particular, 
this is the case, in the multi-dimensional as well as the one-dimensional case, 
if all ordering costs are directly proportional to the amounts ordered. We also 
consider a number of cases where the penalty cost for ordering to meet an excess
of demand over supply contains a fixed administrative or “red tape” cost.
This is the usual model which has been treated. Here the results are less complete
due to the fact that the optimal policies seem to have much more complicated
structures.

If we introduce “red-tape” or “set-up” costs which are independent of the 
quantity ordered, the problems that arise become much more formidable and 
escape our methods. Furthermore, it can be shown, by means of examples, that 
the structure of the optimal policies changes radically.

Part III considers two processes with more complicated optimal policies.
One arises from the consideration of a convex cost function for initial ordering 
and the other from a time-lag between order and delivery. The solution to both 
problems is obtained by means of the use of successive approximations. 


Part I—Mathematical Formulation 


3. Formulation of the General Problem 

The problem we shall discuss in various related forms is a particular case of the 
general problem of decision-making in the face of an uncertain future. The ver¬ 
sion we shall consider is concerned with the problem of stocking a supply of items 
to meet an uncertain demand. 

The situation is as follows: at various specified times we have an opportunity
to order supplies of a certain set of items, where the cost of ordering depends 
upon the number ordered of each item, and where there may or may not be some 
fixed administrative costs which are independent of the number ordered. At 
various other times, demands are made upon the stocks of these items. The 
interesting case is where these demands are not known in advance, but where we 
do know the joint distribution of demands. The incentive for ordering lies in a 
penalty which is assessed whenever the demand of an item exceeds the supply.
Different penalties are levied in different fields of activity which means that a 
number of different models must be considered. An important case is where the 
penalty is directly proportional to the excess of demand over supply. 

Speaking loosely, we wish to determine the ordering policy at each stage which
will minimize some average function of the total cost of carrying on the activity.

(1)
(a) $\phi(s)\,ds$ = the probability that the demand will lie between $s$ and
$s + ds$.¹

(b) $k(z)$ = the cost of ordering $z$ items initially to increase the stock
level.

(c) $p(z)$ = the cost of ordering $z$ items to meet an excess, $z$, of demand
over supply, the penalty cost.


Let $x$ denote the stock level at the initiation of the process. Assuming that
there are $n$ stages we will order a quantity $y_1$ at the first, where $y_1$ depends upon
$x$, $y_2$ at the second stage, where $y_2$ depends upon the new stock level, and so on.
A set $(Y_1, Y_2, \ldots, Y_n)$ of functions, determining for each $k \le n$ the amount
$Y_k = Y_k(x_k)$ to be ordered at the $k$th stage as a function of the stock level $x_k$,
will be called a policy. Corresponding to each policy, there will be a certain expected
total cost for this $n$-stage process, consisting of initial ordering and
penalty costs.

¹ We shall employ Riemann integrals throughout to simplify the discussion. It will readily
be seen that the results carry over to the general situation with suitable attention
to possible non-uniqueness of roots of certain equations we shall derive.

The problem we set ourselves is that of determining the policy, or policies, 
which minimize the expected total cost. A policy which yields this minimum 
expected cost is called optimal.²

At any stage, the problem is characterized completely by two state variables,
$x$, the supply of stock, and $n$, the number of remaining stages. Let us then define

(2) $f_n(x)$ = expected total cost for an $n$-stage process starting with an initial
supply $x$ and using an optimal ordering policy.

Let us now proceed to obtain a functional equation for $f_n(x)$. We have, for the
one-stage process, a cost equal to

$$k(y - x) + \int_y^{\infty} p(s - y)\,\phi(s)\,ds, \tag{3}$$

if a quantity $y - x \ge 0$ is ordered.

Since $y$ is to be chosen to minimize the expected cost, we see that $f_1(x)$ is
given by

$$f_1(x) = \operatorname*{Min}_{y \ge x} \left[ k(y - x) + \int_y^{\infty} p(s - y)\,\phi(s)\,ds \right]. \tag{4}$$

In general, for $n \ge 2$ we have

$$f_n(x) = \operatorname*{Min}_{y \ge x} \left[ k(y - x) + \int_y^{\infty} p(s - y)\,\phi(s)\,ds + f_{n-1}(0) \int_y^{\infty} \phi(s)\,ds + \int_0^{y} f_{n-1}(y - s)\,\phi(s)\,ds \right], \tag{5}$$

upon enumerating the possibilities, cf. AHM, [1].
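The recursion (4)-(5) is straightforward to carry out once demand and stock are restricted to a finite grid of integers. The sketch below is a discrete analogue under that assumption, not the authors' computation: integrals become sums over a demand probability list, the stock level is capped at the top of the grid, and the optional ratio `a` (equal to 1 here) is included only so that the same function can be reused for discounted variants. All names and the numerical example are hypothetical.

```python
def one_step(f_prev, phi, k, p, a=1.0):
    """Apply the minimisation in (5) once on the grid x = 0 .. len(f_prev) - 1.

    f_prev -- values f_{n-1}(x); all zeros reproduces equation (4)
    phi    -- phi[s] = Pr{demand = s}
    k, p   -- ordering and penalty cost functions
    a      -- ratio applied to everything except k; a = 1 corresponds to (5)
    Returns (values, policy), where policy[x] is the minimising level y.
    """
    top = len(f_prev) - 1

    def expected_after(y):
        total = 0.0
        for s, prob in enumerate(phi):
            if s > y:
                # Demand exceeds stock: pay the penalty and continue from 0.
                total += prob * (p(s - y) + f_prev[0])
            else:
                # Demand is met: continue from the remaining stock y - s.
                total += prob * f_prev[y - s]
        return total

    values, policy = [], []
    for x in range(top + 1):
        best_y = min(range(x, top + 1),
                     key=lambda y: k(y - x) + a * expected_after(y))
        values.append(k(best_y - x) + a * expected_after(best_y))
        policy.append(best_y)
    return values, policy


phi = [0.2, 0.3, 0.3, 0.2]                 # demand of 0, 1, 2 or 3 units
order_cost = lambda z: 1.0 * z             # proportional ordering cost
penalty = lambda z: 4.0 * z                # proportional penalty cost

f1, policy1 = one_step([0.0] * 6, phi, order_cost, penalty)
f2, policy2 = one_step(f1, phi, order_cost, penalty)
print(policy1, policy2)
```

With these proportional costs the one-stage policy orders up to a fixed level whenever the stock is below it and orders nothing otherwise, which is the constant stock level structure discussed in the Introduction.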


2.2 Unbounded Time Period—Discounted Cost 

If we wish to consider an unbounded period of time over which this process 
operates, we must introduce some device to prevent infinite costs from entering. 

The most natural such device is that of discounting the future costs, using a
fixed discount ratio, $a$, for each period. This possesses a certain amount of economic
justification and a great deal of mathematical virtue, particularly in its
invariant aspect.

If we set

(6) $f(x)$ = expected total discounted cost starting with an initial supply $x$ and
using an optimal policy,

we obtain, by the same enumeration of possibilities, in place of (5) the functional
equation

$$f(x) = \operatorname*{Min}_{y \ge x} \left[ k(y - x) + a \int_y^{\infty} p(s - y)\,\phi(s)\,ds + a f(0) \int_y^{\infty} \phi(s)\,ds + a \int_0^{y} f(y - s)\,\phi(s)\,ds \right]. \tag{7}$$

The advantage of (7) over (5) lies in the fact that it contains $f(x)$, one function
of one variable, in place of a sequence of functions, $\{f_n(x)\}$.

² Another criterion, of probably greater importance, which we shall not discuss here, is
that of minimizing the probability that the cost exceeds a fixed level.

2.3 Unbounded Time Period—Partially Expendable Items 

If we assume that some of the items supplied upon demand may be partially
recovered, so that a demand of $s$ items results in a return of $bs$ items, $0 \le b \le 1$,
which may be used again, the analogue of (7) is

$$f(x) = \operatorname*{Min}_{y \ge x} \left[ k(y - x) + a \int_y^{\infty} p(s - y)\,\phi(s)\,ds + a \int_y^{\infty} f(bs)\,\phi(s)\,ds + a \int_0^{y} f(y - s + bs)\,\phi(s)\,ds \right].$$


2.4 Unbounded Time Period—One Period Lag in Supply 

Let us now assume that when we order a quantity $z$ it does not become available
until one period later. If the current supply is $x$ and $y$ was on order from
the period before, $x + y$ will be available to meet the next demand. The functional
equation corresponding to (7) is now of more complicated form

$$f(x) = \operatorname*{Min}_{z \ge 0} \left[ k(z) + a \int_x^{\infty} p(s - x)\,\phi(s)\,ds + a f(z) \int_x^{\infty} \phi(s)\,ds + a \int_0^{x} f(x - s + z)\,\phi(s)\,ds \right].$$


2.5 Unbounded Time Period—Two Period Lag 

If we have a two period lag, we have two state variables which describe the
state of the process,

(1) $x$ = quantity of stock available to meet the next demand,
$y$ = quantity to be delivered one period after that.

Hence we define

(2) $f(x, y)$ = expected total cost with $x$ and $y$ as above using an optimal policy.

Then $f(x, y)$ satisfies the equation

$$f(x, y) = \operatorname*{Min}_{z \ge 0} \left[ k(z) + a \int_x^{\infty} p(s - x)\,\phi(s)\,ds + a f(y, z) \int_x^{\infty} \phi(s)\,ds + a \int_0^{x} f(x - s + y, z)\,\phi(s)\,ds \right]. \tag{3}$$

We shall not consider this equation here, although it is amenable to the same
techniques we apply to the others.

3. Existence and Uniqueness Theorems 


There is a uniform technique, the method of successive approximations, first 
exploited by Picard, for obtaining results concerning the existence and unique¬ 
ness of solutions of functional equations. It is particularly important for our 
discussion since we shall consistently determine properties of the solution by 
demonstrating that all the approximations possess these properties. 

Let us illustrate how one could obtain existence and uniqueness for the general 
class of equations of which the above are particular examples by considering 
equation (7) in section 2.2. A more extended treatment of this general class of 
equations may be found in (3). 

To simplify the notation, let us set

$$T(y, x, f) = k(y - x) + a \int_y^{\infty} p(s - y)\,\phi(s)\,ds + a f(0) \int_y^{\infty} \phi(s)\,ds + a \int_0^{y} f(y - s)\,\phi(s)\,ds. \tag{1}$$

Then equation (2.7) has the form

$$f(x) = \operatorname*{Min}_{y \ge x} T(y, x, f). \tag{2}$$

Let us impose the following conditions

(3)
(a) $\phi(s) \ge 0$, $\int_0^{\infty} \phi(s)\,ds = 1$,

(b) $p(s)$ is continuous, monotone increasing, and $\int_0^{\infty} p(s)\,\phi(s)\,ds < \infty$,

(c) $k(y)$ is continuous for $y \ge 0$ and $k(0) = 0$,

(d) $0 < a < 1$.

Under these conditions, we have the result 

Theorem 1. There is a unique bounded solution to (2). This solution, $f(x)$, is
continuous. Let $f_0(x)$ be any non-negative bounded continuous function defined
over $0 \le x < \infty$, and define the sequence $\{f_n(x)\}$ as follows for $n = 0, 1, \ldots$,

$$f_{n+1}(x) = \operatorname*{Min}_{y \ge x} \left[ T(y, x, f_n) \right]. \tag{4}$$

Then

$$f(x) = \lim_{n \to \infty} f_n(x).$$
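The behaviour asserted by Theorem 1 can be observed numerically on a discretised problem. The sketch below is an illustration under stated assumptions rather than part of the proof: demand and stock are restricted to a finite integer grid, integrals become sums, and the cost functions and discount ratio in the usage lines are hypothetical. It records the largest change between successive approximations, which should shrink by at least the factor $a$ at each step, and shows two different starting functions arriving at the same limit.

```python
def apply_T(f, phi, k, p, a):
    """One successive approximation: f_{n+1}(x) = min over y >= x of T(y, x, f_n)."""
    top = len(f) - 1

    def T(y, x):
        tail = sum(q * (p(s - y) + f[0]) for s, q in enumerate(phi) if s > y)
        head = sum(q * f[y - s] for s, q in enumerate(phi) if s <= y)
        return k(y - x) + a * (tail + head)

    return [min(T(y, x) for y in range(x, top + 1)) for x in range(top + 1)]


def successive_approximations(f0, phi, k, p, a, steps):
    """Return (gaps, f) where gaps[n] = max_x |f_{n+1}(x) - f_n(x)|."""
    gaps, f = [], list(f0)
    for _ in range(steps):
        g = apply_T(f, phi, k, p, a)
        gaps.append(max(abs(u - v) for u, v in zip(g, f)))
        f = g
    return gaps, f


phi = [0.2, 0.3, 0.3, 0.2]
order_cost = lambda z: 1.0 * z
penalty = lambda z: 4.0 * z
a = 0.9

gaps, f_from_zero = successive_approximations([0.0] * 6, phi, order_cost, penalty, a, 200)
_, f_from_five = successive_approximations([5.0] * 6, phi, order_cost, penalty, a, 200)
print(gaps[:3])
print(max(abs(u - v) for u, v in zip(f_from_zero, f_from_five)))
```

The geometric decay of `gaps` is the discrete counterpart of the estimate obtained in the proof, and the agreement of the two limits reflects uniqueness.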

Proof: Let us begin by showing that the sequence $f_n(x)$ is uniformly bounded.
If $|f_0(x)| \le M$ for $x \ge 0$, we have

$$|f_1(x)| \le T(x, x, f_0) \le a \int_0^{\infty} p(s)\,\phi(s)\,ds + aM \left[ \int_x^{\infty} \phi(s)\,ds + \int_0^{x} \phi(s)\,ds \right] = a \int_0^{\infty} p(s)\,\phi(s)\,ds + aM. \tag{5}$$

From this we see by means of an inductive argument that

$$|f_n(x)| \le \left( a \int_0^{\infty} p(s)\,\phi(s)\,ds + M \right) \Big/ (1 - a). \tag{6}$$

Now to establish convergence. For each $n$, we see, as a consequence of our
assumptions, that $f_n(x)$ and $T(y, x, f_n)$ are continuous functions of $x$ and $y$ for
$y \ge x$. Let $y_n = y_n(x)$ be a value of $y$ for which $T(y, x, f_n)$ attains its minimum.
This value is never infinite, since $T(\infty, x, f_n) = \infty$ for all $x \ge 0$. For the sake
of definiteness, let $y_n$ be the smallest such value.

We have then 

(7) $f_{n+1} = T(y_n, x, f_n) \le T(y_{n-1}, x, f_n),$
    $f_n = T(y_{n-1}, x, f_{n-1}) \le T(y_n, x, f_{n-1}).$

Combining these inequalities we have 

(8) $f_{n+1} - f_n \ge T(y_n, x, f_n) - T(y_n, x, f_{n-1}),$
    $f_{n+1} - f_n \le T(y_{n-1}, x, f_n) - T(y_{n-1}, x, f_{n-1}),$

whence 

(9) $|f_{n+1} - f_n| \le \max\left[ a\int_0^{y_n} |f_n(y_n - s) - f_{n-1}(y_n - s)|\,\phi(s)\,ds + a\,|f_n(0) - f_{n-1}(0)|\int_{y_n}^\infty \phi(s)\,ds,\ a\int_0^{y_{n-1}} |f_n(y_{n-1} - s) - f_{n-1}(y_{n-1} - s)|\,\phi(s)\,ds + a\,|f_n(0) - f_{n-1}(0)|\int_{y_{n-1}}^\infty \phi(s)\,ds \right].$

Hence 

(10) $\max_{0 \le x} |f_{n+1} - f_n| \le a \max_{0 \le x} |f_n - f_{n-1}| \int_0^\infty \phi(s)\,ds = a \max_{0 \le x} |f_n - f_{n-1}|.$

Consequently, the series $\sum_{n=0}^\infty (f_{n+1}(x) - f_n(x))$ converges uniformly for all 
$x \ge 0$ and $f_n(x)$ converges to $f(x)$, a bounded solution of (2). Since each element 
of the sequence $\{f_n(x)\}$ is continuous, $f(x)$ is continuous. 

To establish uniqueness, let $F(x)$ be another bounded solution of (2) and use 
the same technique as above (8) for the two equations 

(11) $F(x) = \min_{y \ge x} T(y, x, F),$
     $f_{n+1}(x) = \min_{y \ge x} T(y, x, f_n).$

We readily see that $F(x) - f_n(x) \to 0$ as $n \to \infty$. Hence $F(x) = f(x)$. 
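
The successive approximations of Theorem 1 can be tried numerically. The following is a minimal sketch, not part of the original argument. It assumes linear costs k(z) = kz and p(z) = pz, an exponential demand density with rate b, a uniform grid of step h, and demand mass on each grid cell lumped at the cell's left end; the function name and all parameter values are illustrative choices.

```python
import math

def successive_approximations(a=0.8, k=1.0, p=3.0, b=1.0,
                              x_max=6.0, h=0.05, sweeps=100):
    """Iterate f_{n+1}(x) = min_{y >= x} T(y, x, f_n) on a grid, starting from f_0 = 0."""
    n = int(round(x_max / h))
    grid = [i * h for i in range(n + 1)]
    # Probability that demand falls in [j*h, (j+1)*h); lumped at j*h (an assumption).
    w = [math.exp(-b * j * h) - math.exp(-b * (j + 1) * h) for j in range(n + 1)]
    tail = [math.exp(-b * y) for y in grid]      # integral of phi over [y, inf)
    shortage = [p * t / b for t in tail]         # p * integral of (s - y) phi(s) over [y, inf)
    f = [0.0] * (n + 1)
    policy = list(range(n + 1))
    gaps = []
    for _ in range(sweeps):
        # G[m] = k*y_m + a*(...) so that T(y_m, x, f) = G[m] - k*x.
        G = []
        for m in range(n + 1):
            carry = sum(f[m - j] * w[j] for j in range(m))
            G.append(k * grid[m] + a * (shortage[m] + f[0] * tail[m] + carry))
        new_f = [0.0] * (n + 1)
        best = n
        for i in range(n, -1, -1):               # suffix minimum over y_m >= x_i
            if G[i] <= G[best]:                  # ties go to the smallest y
                best = i
            policy[i] = best
            new_f[i] = G[best] - k * grid[i]
        gaps.append(max(abs(u - v) for u, v in zip(new_f, f)))
        f = new_f
    return grid, f, policy, gaps
```

The list `gaps` records the maximum change between consecutive approximations, which by (10) should shrink by at least the factor a at every sweep; `policy[i]` is the grid index of the minimizing y for the state `grid[i]`.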

Observe that if we take 

(12) $f_1(x) = \min_{y \ge x} \left[ k(y - x) + a\int_y^\infty p(s - y)\phi(s)\,ds \right],$
     $f_2(x) = \min_{y \ge x} T(y, x, f_1),$

and so on, we readily obtain $f_2 \ge f_1$, and thus monotone increasing convergence. 

On the other hand, if we approximate in “policy space”; e.g., choose $y = x$ 
continually, we have for $f_1(x)$ the functional equation 

(13) $f_1(x) = T(x, x, f_1),$

and then $f_2(x)$ defined by 

$f_2(x) = \min_{y \ge x} T(y, x, f_1).$

Clearly $f_2(x) \le f_1(x)$ for all $x \ge 0$ and thus we obtain monotone decreasing 
convergence. 


4. A Simple Formal Observation 

In this section we wish to present the fundamental formal analytic property 
of functional equations of the form 

(1) $u(x) = \min_y v(x, y), \quad y \in R(x),$

upon which all our subsequent work depends. 

In general, the variation will be over some region, R(x), dependent upon x. 
Let us assume that the minimum is attained inside the region, and that v is 
differentiable. Then at the minimizing value of y we have 

(2) $0 = v_y.$

This determines a function $y(x)$, which need not be single-valued; however, let 
us assume we may select one such value $y(x)$ for each x so that the resulting 
function is differentiable. 

Then, for this function y we have 

(3) $u(x) = v[x, y(x)].$

The crucial observation is now that 

(4) $u'(x) = v_x + v_y\,dy/dx = v_x,$

since $v_y = 0$ by (2). 
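
The observation u'(x) = v_x at the minimizing y is easy to check numerically. The sketch below is an illustration only: the test function v(x, y) = (y - x)^2 + y^2, the ternary search, and the step sizes are assumptions chosen for the example and do not come from the paper.

```python
def v(x, y):
    # Hypothetical smooth test function, convex in y.
    return (y - x) ** 2 + y ** 2

def argmin_y(x, lo=-10.0, hi=10.0, iters=200):
    # Ternary search; valid here because v(x, .) is convex.
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if v(x, m1) < v(x, m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)

def u(x):
    return v(x, argmin_y(x))

def envelope_check(x, eps=1e-4):
    """Return (u'(x), v_x at the minimizing y), both by central differences."""
    y_star = argmin_y(x)
    du_dx = (u(x + eps) - u(x - eps)) / (2.0 * eps)
    dv_dx = (v(x + eps, y_star) - v(x - eps, y_star)) / (2.0 * eps)
    return du_dx, dv_dx
```

For this v the minimum is at y = x/2, so u(x) = x^2/2 and both returned numbers should be close to x.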

Similarly, if 

(5) $u(x_1, x_2) = \min\, v(x_1, x_2, y_1, y_2), \quad (y_1, y_2) \in R(x_1, x_2),$

and we assume that the minimum is attained on the inside, we have 

(6) $u_{x_1} = v_{x_1}, \quad u_{x_2} = v_{x_2}$

at the minimizing points. 

Let us now apply these remarks to the functional equation of (2.2.7) under 
the assumption that $k(z) = kz$, a linear function of z, and that $p(z) = pz$, 

(7) $f(x) = \min_{y \ge x} \left[ ky - kx + a\int_y^\infty p(s - y)\phi(s)\,ds + a f(0)\int_y^\infty \phi(s)\,ds + a\int_0^y f(y - s)\phi(s)\,ds \right].$

If the minimum is attained at a point $y > x$, we have at this point 

(8) $k - ap\int_y^\infty \phi(s)\,ds + a\int_0^y f'(y - s)\phi(s)\,ds = 0,$

an equation independent of x! 
Furthermore, for this value of $y = y(x)$, we have 

(9) $f'(x) = -k.$

These results, combined and interpreted, furnish the clues to the solution of 
all the cases we consider. We shall discuss them in more detail in §6, and in §7 
we shall utilize the multi-dimensional analogues. 

Part II—Constant Stock Level 

5. Preliminaries 

In this part of the paper we shall consider several processes characterized by 
a constant stock level policy. As pointed out in the following part, the addition 
of an administrative cost in ordering changes the nature of the optimal policy 
completely. 

In §6 we obtain the complete solution, for an arbitrary distribution function 
$\phi(s)$, of the case where the cost of ordering is directly proportional to the 
amount ordered and the penalty cost is also directly proportional to the amount 
ordered. In §7 we extend this result to the multi-dimensional case, and show 
that the solution for the case where there are many items subject to a joint 
distribution of demand possesses the very important property of sub-optimality.3 

Turning from the consideration of these processes involving unbounded time 
intervals, we consider the finite process described by (2.1.5) and show that 
again the assumption of direct proportionality entails a principle of constant 
stock level at each stage. 

We now enter territory which is much rougher when we consider the case 
where the penalty cost includes a “red-tape” term which is independent of the 
amount ordered. The form of the solution now seems to depend upon the form 
of the demand function. Several important classes of distribution functions fall 
within the categories we can handle precisely. 

Finally we indicate the form of the general solution without, however, being 
able to make any use of it. 


6. Proportional Cost—One-Dimensional Case 

In this section we present the solution of the case where both cost functions 
are directly proportional to the amounts ordered. 

Theorem 2. Consider the equation 

(1) $f(x) = \min_{y \ge x} \left[ k(y - x) + a\int_y^\infty p(s - y)\phi(s)\,ds + a f(0)\int_y^\infty \phi(s)\,ds + a\int_0^y f(y - s)\phi(s)\,ds \right],$

where we impose the conditions 

(2) (a) k and p are positive constants, 

    (b) $\phi(s) > 0$, $\int_0^\infty \phi(s)\,ds = 1$, $\int_0^\infty s\phi(s)\,ds < \infty$, 

    (c) $0 < a < 1$, 

    (d) $ap > k$. 

Let $\bar{x}$ be the unique root of 

(3) $k = ap\int_x^\infty \phi(s)\,ds + ak\int_0^x \phi(s)\,ds.$

Then the optimal policy has the form 

(4) (a) for $0 \le x \le \bar{x}$, $y = \bar{x}$, 

    (b) for $x \ge \bar{x}$, $y = x$. 

In other words, the optimal stock level is $\bar{x}$. 


3 By “sub-optimality” we mean here that the optimal stock level for any item can be 
assigned independently of the levels assigned to the other items. 

4 It was pointed out by the referee that equation (3) has the following simple 
interpretation. The run-out probability must be set at the level where the marginal cost for holding 
inventory will be balanced against the marginal penalty for run-out. 
If $ap \le k$, the solution is given by $y = x$ for $x \ge 0$; i.e., never order. 

In order to understand the genesis of this solution, let us proceed heuristically. 
If we obtain a plausible solution and then verify directly that it satisfies the 
equation in (1) above, the uniqueness theorem tells us that it is the solution. 

As pointed out in §4, if the minimum occurs at $y > x$, the minimizing values 
of y must be roots of the equation 

(5) $k + a\left[ -p\int_y^\infty \phi(s)\,ds + \int_0^y f'(y - s)\phi(s)\,ds \right] = 0,$

where 

(6) $f'(x) = -k.$

Now let us pull ourselves up by our bootstraps! If the solution has the 
conjectured form, the complicated term, $\int_0^y f'(y - s)\phi(s)\,ds$, may be replaced by 
the simpler term $-k\int_0^y \phi(s)\,ds$ so that the equation in (5) may be replaced by 
the simpler equation 

(7) $k - ap\int_y^\infty \phi(s)\,ds - ak\int_0^y \phi(s)\,ds = 0,$

which determines y, without involving $f'(x)$, as yet unknown. 

Since $\int_0^\infty \phi(s)\,ds = 1$, this equation reduces to 

(8) $\int_0^y \phi(s)\,ds = (ap - k)/a(p - k),$

an equation which possesses exactly one root under the assumption that $\phi(s) > 0$. 
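
Equation (8) says that the critical level is a quantile of the demand distribution. As a hedged illustration, the sketch below finds it by bisection for any continuous, strictly increasing distribution function; the function name `critical_level` and the exponential example are assumptions introduced here, not taken from the paper.

```python
import math

def critical_level(cdf, a, k, p):
    """Root of cdf(y) = (a*p - k) / (a*(p - k)), i.e. equation (8); 0 if a*p <= k."""
    if a * p <= k:
        return 0.0                      # the "never order" case
    target = (a * p - k) / (a * (p - k))
    hi = 1.0
    while cdf(hi) < target:             # expand until the root is bracketed
        hi *= 2.0
    lo = 0.0
    for _ in range(100):                # plain bisection
        mid = 0.5 * (lo + hi)
        if cdf(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def exponential_cdf(y, b=1.0):
    return 1.0 - math.exp(-b * y)
```

For exponential demand with rate b the root is available in closed form, (1/b) log(a(p - k) / (k(1 - a))), which gives a convenient cross-check on the bisection.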

Having determined $\bar{x}$ as the root of (8), we proceed to determine $f(x)$ as 
follows. 

For $0 \le x \le \bar{x}$ we have 

(9) $f(x) = k(\bar{x} - x) + a\left[ \int_{\bar{x}}^\infty p(s - \bar{x})\phi(s)\,ds + f(0)\int_{\bar{x}}^\infty \phi(s)\,ds + \int_0^{\bar{x}} f(\bar{x} - s)\phi(s)\,ds \right],$

and $f'(x) = -k$, or, 

(10) $f(x) = f(0) - kx.$

Using this in (9) and setting $x = 0$, we obtain the following result for $f(0)$, 

(11) $f(0) = \left[ k\bar{x} + pa\int_{\bar{x}}^\infty (s - \bar{x})\phi(s)\,ds - ak\int_0^{\bar{x}} (\bar{x} - s)\phi(s)\,ds \right] \Big/ (1 - a).$

5 Note that the value of $\bar{x}$ given in (8) is the value of $\bar{x}$ which minimizes $f(0)$. 

To determine $f(x)$ for $x \ge \bar{x}$ we employ the equation 

(12) $f(x) = a\left[ \int_x^\infty p(s - x)\phi(s)\,ds + f(0)\int_x^\infty \phi(s)\,ds + \int_0^x f(x - s)\phi(s)\,ds \right],$

which we write 

(13) $f(x) = u(x) + a\int_0^x f(x - s)\phi(s)\,ds,$

where $u(x)$ is a known function of x. This, in turn, we write 

(14) $f(x) = u(x) + a\int_0^{x - \bar{x}} f(x - s)\phi(s)\,ds + a\int_{x - \bar{x}}^x f(x - s)\phi(s)\,ds.$

In the interval $[x - \bar{x}, x]$, $f(x - s)$ is known, hence we may write, combining the 
$u(x)$ term and the second integral 

(15) $f(x) = v(x) + a\int_0^{x - \bar{x}} f(x - s)\phi(s)\,ds, \quad x \ge \bar{x}.$

If we now set $x - \bar{x} = z$ and $f(\bar{x} + z) = g(z)$, we see that $g(z)$ satisfies the 
equation 

(16) $g(z) = v(\bar{x} + z) + a\int_0^z g(z - s)\phi(s)\,ds, \quad z \ge 0,$

a simple renewal equation which can be solved by iteration if we wish. 
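
To make "solved by iteration" concrete, here is a rough numerical sketch of the iteration for a renewal equation of the form (16). It is an illustration under stated assumptions: the integral is replaced by the trapezoidal rule on a uniform grid, and the forcing term and density are passed in as ordinary Python functions (`forcing` and `phi` are hypothetical names).

```python
import math

def solve_renewal(forcing, phi, a, z_max=4.0, h=0.02, sweeps=60):
    """Iterate g <- forcing + a * (g convolved with phi) on a grid over [0, z_max]."""
    n = int(round(z_max / h))
    z = [i * h for i in range(n + 1)]
    dens = [phi(s) for s in z]
    base = [forcing(t) for t in z]
    g = list(base)                                 # zeroth approximation
    for _ in range(sweeps):
        new_g = []
        for i in range(n + 1):
            if i == 0:
                conv = 0.0
            else:
                # Trapezoidal rule for the integral of g(z_i - s) phi(s) over [0, z_i].
                conv = 0.5 * (g[i] * dens[0] + g[0] * dens[i])
                conv += sum(g[i - j] * dens[j] for j in range(1, i))
                conv *= h
            new_g.append(base[i] + a * conv)
        g = new_g
    return z, g
```

With a constant forcing term equal to 1 and phi(s) = exp(-s), the exact solution is g(z) = (1 - a exp(-(1 - a) z)) / (1 - a), which can be used to judge the discretization error of the sketch.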

Actually, it is much simpler to differentiate (12) first and then proceed as 
above. It seems to be a general characteristic of functional equations in the 
theory of dynamic programming that the derivatives satisfy simpler equations, 
and are the more basic quantities. 

Let us turn now to a proof that the conjectured solution is actually a solution. 
Call the bounded function obtained above $F(x)$ and the constant in (11), C. 
Then $F(x)$ is completely determined by the following equations 

(17) (a) $F(x) = C - kx, \quad 0 \le x \le \bar{x},$

     (b) $F(x) = a\left[ \int_x^\infty p(s - x)\phi(s)\,ds + F(0)\int_x^\infty \phi(s)\,ds + \int_0^x F(x - s)\phi(s)\,ds \right], \quad x \ge \bar{x}.$

Let us begin by showing that $F(y) + ky$ is non-decreasing for $y \ge \bar{x}$. We 
have, using the expression for $F(x)$ given in (17b), for $x > \bar{x}$ 

(18) $F'(x) = -ap\int_x^\infty \phi(s)\,ds + a\int_0^x F'(x - s)\phi(s)\,ds.$

In the interval $[x - \bar{x}, x]$, we have $0 \le x - s \le \bar{x}$, and hence $F'(x - s) = -k$. 
Thus, for $x \ge \bar{x}$, 

(19) $F'(x) = -ap\int_x^\infty \phi(s)\,ds - ka\int_{x - \bar{x}}^x \phi(s)\,ds + a\int_0^{x - \bar{x}} F'(x - s)\phi(s)\,ds,$

or 

(20) $F'(x) + k = k - ap\int_x^\infty \phi(s)\,ds - ak\int_0^x \phi(s)\,ds + a\int_0^{x - \bar{x}} [F'(x - s) + k]\phi(s)\,ds.$

The expression $u(x) = k - ap\int_x^\infty \phi(s)\,ds - ak\int_0^x \phi(s)\,ds$ is zero at $x = \bar{x}$ and 
increasing thereafter. Setting $x - \bar{x} = z$ and $F'(\bar{x} + z) + k = g(z)$, we see that 
$g(z)$ satisfies the equation 

(21) $g(z) = u(\bar{x} + z) + a\int_0^z g(z - s)\phi(s)\,ds, \quad z \ge 0.$

Hence $g(z)$ is positive for $z > 0$, as we see from the Neumann solution. 
Let us now show that $F(x)$ satisfies the equation 

(22) $F(x) = \min_{y \ge x} \left[ k(y - x) + a\left\{ \int_y^\infty p(s - y)\phi(s)\,ds + F(0)\int_y^\infty \phi(s)\,ds + \int_0^y F(y - s)\phi(s)\,ds \right\} \right],$

or 

(23) $F(x) + kx = \min_{y \ge x} \left\{ ky + a\left[ \int_y^\infty p(s - y)\phi(s)\,ds + F(0)\int_y^\infty \phi(s)\,ds + \int_0^y F(y - s)\phi(s)\,ds \right] \right\}.$

Now for $y \ge \bar{x}$, $\{\cdots\} = ky + F(y)$ by (17(b)), and since this function is 
non-decreasing (23) clearly holds for $x \ge \bar{x}$. On the other hand for $y \le \bar{x}$, 

(24) $\{\cdots\} = ky + a\left[ \int_y^\infty p(s - y)\phi(s)\,ds + F(0)\int_y^\infty \phi(s)\,ds + \int_0^y [F(0) - k(y - s)]\phi(s)\,ds \right] = ky + a\left[ \int_y^\infty p(s - y)\phi(s)\,ds + F(0) - k\int_0^y (y - s)\phi(s)\,ds \right]$

and thus has the derivative 

(25) $k + a\left[ -p\int_y^\infty \phi(s)\,ds - k\int_0^y \phi(s)\,ds \right];$

moreover, by (3), this derivative vanishes at $y = \bar{x}$. Since $\{\cdots\}'' = a(p - k)\phi(y)$ 
almost everywhere, and this quantity is non-negative, $\{\cdots\}' \le 0$ for $y \le \bar{x}$, or 
$\{\cdots\}$ is non-increasing on $[0, \bar{x}]$. Since $\{\cdots\}$ is non-decreasing for $y \ge \bar{x}$, the 
minimum in (23) is assumed at $y = \bar{x}$ for $x < \bar{x}$. (17(a)) and (17(b)) yield the 
same value of $F(\bar{x})$ so (23) holds for $x < \bar{x}$ if and only if 

(26) $F(x) + kx = k\bar{x} + F(\bar{x}),$

which clearly follows from (17(a)). 

In the case $ap \le k$, taking $\bar{x} = 0$ in (17) yields an F which is easily seen to 
satisfy (23), since, as above, $F(y) + ky$ is non-decreasing. 

This completes the proof. It is interesting to note that the solution for 
$0 \le x \le \bar{x}$, the most important part of the solution, can be found without reference 
to the form of the solution for $x > \bar{x}$. 


7. Proportional Cost—Multi-Dimensional Case 

Let us now consider the multi-dimensional version of the problem. Here we 
have N items whose stock levels will be denoted by $x_1, x_2, \ldots, x_N$, and whose 
demand $(s_1, s_2, \ldots, s_N)$ at any time is subject to a distribution function 

$\phi(s_1, s_2, \ldots, s_N).$

In formulating the functional equation for the function $f(x_1, x_2, \ldots, x_N)$, 
the minimum expected over-all discounted cost, let us, for the sake of simplicity, 
consider only the two-dimensional case. 

The remarkable fact that emerges is that the form of the solution is 
precisely the same as if $\phi(s_1, s_2, \ldots, s_N)$ had the form $\phi_1(s_1)\phi_2(s_2) \cdots \phi_N(s_N)$; i.e., 
uncorrelated demands. It is this which yields the important sub-optimalization 
of the solution which we discuss below. 

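An enumeration of cases yields the following functional equation for $f(x_1, x_2)$. 

(1) $f(x_1, x_2) = \min_{y_1 \ge x_1,\, y_2 \ge x_2} \Big[ k_1(y_1 - x_1) + k_2(y_2 - x_2) + a\Big\{ \int_{y_1}^\infty\!\int_{y_2}^\infty [p_1(s_1 - y_1) + p_2(s_2 - y_2)]\phi(s_1, s_2)\,ds_1\,ds_2 + f(0, 0)\int_{y_1}^\infty\!\int_{y_2}^\infty \phi(s_1, s_2)\,ds_1\,ds_2 + \int_{y_1}^\infty\!\int_0^{y_2} [p_1(s_1 - y_1) + f(0, y_2 - s_2)]\phi(s_1, s_2)\,ds_1\,ds_2 + \int_0^{y_1}\!\int_{y_2}^\infty [f(y_1 - s_1, 0) + p_2(s_2 - y_2)]\phi(s_1, s_2)\,ds_1\,ds_2 + \int_0^{y_1}\!\int_0^{y_2} f(y_1 - s_1, y_2 - s_2)\phi(s_1, s_2)\,ds_1\,ds_2 \Big\} \Big].$

Here the outer integral in each term is over $s_1$ and the inner integral over $s_2$. 

Let us simplify our notation a bit by setting $\phi(s_1, s_2)\,ds_1\,ds_2 = dG(s_1, s_2)$ 
and call the quantity within the brackets $K(y_1, y_2)$. We then have 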
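(2) $\dfrac{\partial K}{\partial y_1} = k_1 + a\left[ -p_1\int_{s_1 = y_1}^\infty\!\int_{s_2 = 0}^\infty dG(s_1, s_2) + \int_{s_1 = 0}^{y_1}\!\int_{s_2 = y_2}^\infty f_{x_1}(y_1 - s_1, 0)\,dG(s_1, s_2) + \int_{s_1 = 0}^{y_1}\!\int_{s_2 = 0}^{y_2} f_{x_1}(y_1 - s_1, y_2 - s_2)\,dG(s_1, s_2) \right],$

    $\dfrac{\partial K}{\partial y_2} = k_2 + a\left[ -p_2\int_{s_2 = y_2}^\infty\!\int_{s_1 = 0}^\infty dG(s_1, s_2) + \int_{s_2 = 0}^{y_2}\!\int_{s_1 = y_1}^\infty f_{x_2}(0, y_2 - s_2)\,dG(s_1, s_2) + \int_{s_1 = 0}^{y_1}\!\int_{s_2 = 0}^{y_2} f_{x_2}(y_1 - s_1, y_2 - s_2)\,dG(s_1, s_2) \right].$

Furthermore, as above, if $y_1 > x_1$, $y_2 > x_2$, we can expect that 

(3) $\dfrac{\partial f}{\partial x_1} = -k_1, \quad \dfrac{\partial f}{\partial x_2} = -k_2.$

Consequently, if we assume that the solution here has the same form as in the 
one-dimensional case, the critical levels $\bar{x}_1$ and $\bar{x}_2$ are given as roots of the 
equations 

(4) (i) $k_1 + a\left[ -p_1\int_{s_1 = \bar{x}_1}^\infty\!\int_{s_2 = 0}^\infty dG(s_1, s_2) - k_1\int_{s_1 = 0}^{\bar{x}_1}\!\int_{s_2 = 0}^\infty dG(s_1, s_2) \right] = 0,$

    (ii) $k_2 + a\left[ -p_2\int_{s_2 = \bar{x}_2}^\infty\!\int_{s_1 = 0}^\infty dG(s_1, s_2) - k_2\int_{s_2 = 0}^{\bar{x}_2}\!\int_{s_1 = 0}^\infty dG(s_1, s_2) \right] = 0.$

These roots exist and are unique provided we make the same assumptions as 
above, namely $ap_1 > k_1$, $ap_2 > k_2$, and $dG > 0$. 

We see that $\bar{x}_1$ depends for its determination only upon the unconditional 
distribution $\int_{s_2 = 0}^\infty dG(s_1, s_2)$, and similarly to determine $\bar{x}_2$ we require only 

$\int_{s_1 = 0}^\infty dG(s_1, s_2).$

This is the important property of suboptimalization mentioned above. 

The verification of the solution follows precisely the same lines as that for the 
one-dimensional case, and hence will be omitted, since the details are, of course, 
much more tedious. 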

Let us state our conclusion as 

Theorem 3. Let us impose the following conditions upon the equation (1): 

(5) (a) $k_i$ and $p_i$ are positive constants, 

    (b) $\phi > 0$, $\int_0^\infty\!\int_0^\infty \phi\,ds_1\,ds_2 = 1$, $\int_0^\infty\!\int_0^\infty s_i\phi\,ds_1\,ds_2 < \infty$, 

    (c) $0 < a < 1$, 

    (d) $ap_i > k_i$. 

Let $\bar{x}_i$ be the unique root of 

(6) $k_i = ap_i\int_{\bar{x}_i}^\infty \left( \int_0^\infty \phi(s_1, s_2)\,ds_j \right) ds_i + ak_i\int_0^{\bar{x}_i} \left( \int_0^\infty \phi(s_1, s_2)\,ds_j \right) ds_i, \quad j \ne i.$

Then the optimal policy has the form 

(7) (a) for $0 \le x_i \le \bar{x}_i$, $y_i = \bar{x}_i$, 

    (b) for $x_i \ge \bar{x}_i$, $y_i = x_i$. 

In other words, the optimal stock level for the i-th item is $\bar{x}_i$. 

If $ap_i \le k_i$ for any i, we set $\bar{x}_i = 0$. 

It is clear that this form of the solutions extends immediately to the 
N-dimensional case. 

8. Finite Time Period 

Let us now consider the corresponding problem for a finite process where we 
do not discount future costs. We now wish to minimize the total expected cost 
over a finite time period. 

We define 

(1) $f_N(x)$ = expected cost over an N-stage period starting with an initial 
quantity x and using an optimal N-stage policy. 

Then 

(2) $f_1(x) = \min_{y \ge x} \left[ k(y - x) + p\int_y^\infty (s - y)\phi(s)\,ds \right],$

    $f_{n+1}(x) = \min_{y \ge x} \left[ k(y - x) + p\int_y^\infty (s - y)\phi(s)\,ds + f_n(0)\int_y^\infty \phi(s)\,ds + \int_0^y f_n(y - s)\phi(s)\,ds \right].$

We wish to prove 

Theorem 4. For each n, the optimal policy has the form 

(3) (a) for $x \le x_n$, $y = x_n$, 

    (b) for $x \ge x_n$, $y = x$, 

where the sequence $x_n$ is monotone increasing. 

Proof: The proof will be inductive. We have, with $f_1(x)$ defined as in (2), as 
our critical stock level the solution of 

(4) $k = p\int_y^\infty \phi(s)\,ds,$

which if it exists is unique. This value does exist if we assume that $p > k$, as is 
reasonable to suppose. Call this value $x_1$. It is clear then that for $n = 1$, the 
optimal policy is $y = x_1$ for $x \le x_1$, $y = x$ for $x > x_1$. When $x < x_1$, we have 
$f_1'(x) = -k$, and for $x \ge x_1$, we have 

(5) $f_1(x) = p\int_x^\infty (s - x)\phi(s)\,ds,$

    $f_1'(x) = -p\int_x^\infty \phi(s)\,ds \ge -k,$

    $f_1''(x) = p\phi(x) > 0.$

Hence $f_1'(x) + k \ge 0$ for all $x \ge 0$. 
Consider the case $n = 2$. We have 

(6) $f_2(x) = \min_{y \ge x} \left[ k(y - x) + p\int_y^\infty (s - y)\phi(s)\,ds + f_1(0)\int_y^\infty \phi(s)\,ds + \int_0^y f_1(y - s)\phi(s)\,ds \right].$

The critical value of y is obtained by setting the partial derivative with 
respect to y equal to zero, 

(7) $k = p\int_y^\infty \phi(s)\,ds - \int_0^y f_1'(y - s)\phi(s)\,ds = F_1(y).$

The absolutely continuous function $F_1(y)$ has the derivative 

(8) $F_1'(y) = -p\phi(y) - f_1'(0)\phi(y) - \int_0^y f_1''(y - s)\phi(s)\,ds.$

Since $f_1'' \ge 0$, $p + f_1'(0) > k + f_1'(0) = 0$, we see that $F_1(y)$ is monotone 
decreasing, and there can be at most one root of (7). However, $F_1(0) = p > k$, 
$F_1(\infty) = 0$. Hence there is precisely one root. Call this root $x_2$. 

The policy is then 

(9) $y = x_2, \quad 0 \le x \le x_2,$

    $y = x, \quad x_2 \le x.$

The geometric picture is illuminating. Write (6) in the form 

(10) $f_2(x) + kx = \min_{y \ge x} v(y),$

where $v(y)$ is a known function. From what we have demonstrated above, $v(y)$ 
can be shown by graph 1. 

The function $f_2(x) + kx$ is obtained by drawing the tangent to $v(y)$ at $y = x_2$ 
and continuing it to the left until it hits the v-axis. The function $f_2(x) + kx$ is 
now constant for $0 \le x \le x_2$ and equal to $v(x)$ for $x \ge x_2$. 

It remains to show that $x_2 > x_1$. The quantity $x_1$ is determined by equation 
(4), while $x_2$ is determined by (7). Since $-f_1' \ge 0$, it follows that the curve 

(11) $w = g_2(y) = p\int_y^\infty \phi(s)\,ds - \int_0^y f_1'(y - s)\phi(s)\,ds$

always lies above the curve 

(12) $w = g_1(y) = p\int_y^\infty \phi(s)\,ds,$

for $y > 0$. 

From this it is clear that $x_2 > x_1$. 
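
As a numerical illustration of the inequality between the first two critical levels, the sketch below computes both for an exponential demand density. Everything specific here is an assumption made for the example: the density phi(s) = b exp(-b s), the parameter values, the midpoint quadrature, and the function name.

```python
import math

def first_two_levels(k=1.0, p=3.0, b=1.0, n=2000):
    """Critical levels of the one-stage and two-stage problems for exponential demand."""
    x1 = math.log(p / k) / b            # root of k = p * exp(-b*y), equation (4)

    def f1_prime(t):
        # Derivative of the one-stage cost: -k below x1, -p * P(s > t) above it.
        return -k if t < x1 else -p * math.exp(-b * t)

    def F1(y):
        # Right-hand side of (7), with the integral done by the midpoint rule.
        h = y / n
        integral = h * sum(
            f1_prime(y - (j + 0.5) * h) * b * math.exp(-b * (j + 0.5) * h)
            for j in range(n)
        )
        return p * math.exp(-b * y) - integral

    lo, hi = 0.0, 50.0                  # F1 decreases from p at 0 toward 0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if F1(mid) > k:
            lo = mid
        else:
            hi = mid
    return x1, 0.5 * (lo + hi), F1
```

With k = 1, p = 3, b = 1 the first level is log 3, roughly 1.10, and the second level comes out a little above 2, in line with the argument that the second curve lies above the first.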

In order to continue this proof inductively, we must show that 

(13) $-f_2'(x) \ge -f_1'(x).$

We have 

(14) $-f_1'(x) = k, \quad 0 \le x \le x_1,$

     $-f_1'(x) = p\int_x^\infty \phi(s)\,ds, \quad x \ge x_1,$

and 

(15) $-f_2'(x) = k, \quad 0 \le x \le x_2,$

     $-f_2'(x) = p\int_x^\infty \phi(s)\,ds - \int_0^x f_1'(x - s)\phi(s)\,ds, \quad x \ge x_2.$

In the intervals $[0, x_1]$ and $[x_2, \infty]$, the inequality is clear. In $[x_1, x_2]$, the 
inequality follows from the monotonicity of $k - p\int_x^\infty \phi(s)\,ds$, which is zero at 
$x = x_1$. 

Finally, we wish to demonstrate the convexity of $f_2(x)$. This is clearly true in 
$[0, x_2]$. In $[x_2, \infty]$, we have, using (15), 

(16) $f_2''(x) = p\phi(x) + f_1'(0)\phi(x) + \int_0^x f_1''(x - s)\phi(s)\,ds.$

Since $f_1'(0) + p > 0$, $f_1'' \ge 0$; i.e., we have $f_2''(x) > 0$, and since $f_2'$ is 
continuous, $f_2$ is convex. We now have all the ingredients of an inductive proof. 


9. Finite Time—Multi-Dimensional Case 

The hardy reader may verify that the solution in the multi-dimensional case 
has precisely the same general character. 


10. Non-Proportional Penalty Cost—Red Tape 

As soon as we consider the case where the penalty cost is not directly 
proportional to the excess of demand over supply, we seem to encounter difficulties, 
and it appears that the simple and elegant solution obtained for the case of 
proportional cost is no longer valid generally. 

There are, however, a number of interesting cases in which we still obtain a 
solution involving constant stock level. The most interesting of these occur 
when we take the cost of ordering $(s - y)$ to be $p(s - y) + q$, where q is a fixed 
administrative cost which appears whenever an excess demand occurs, 
regardless of the amount of the demand. 

In the following part we shall discuss in some detail the case where the initial 
ordering cost has the same properties. 

Let us then consider the equation 

(1) $f(x) = \min_{y \ge x} \left[ k(y - x) + a\left\{ \int_y^\infty [p(s - y) + q]\phi(s)\,ds + f(0)\int_y^\infty \phi(s)\,ds + \int_0^y f(y - s)\phi(s)\,ds \right\} \right],$

distinguished from the equation we have considered above by the additional 
term $aq\int_y^\infty \phi(s)\,ds$. It is surprising how much complication this innocuous 
appearing expression would seem to introduce. 

We shall, to begin with, proceed formally on the assumption that there is a 
constant stock level solution. The critical level is then determined by the 
solution of 

(2) $0 = k + a\left[ -p\int_y^\infty \phi(s)\,ds - q\phi(y) + \int_0^y f'(y - s)\phi(s)\,ds \right],$

and we have $f'(x) = -k$ when $y > x$. 

It follows then that $\bar{x}$ will be a root of 

(3) $0 = k + a\left[ -p\int_y^\infty \phi(s)\,ds - q\phi(y) - k\int_0^y \phi(s)\,ds \right].$

Unfortunately, it is not true that this equation has a unique root for all 
distribution functions $\phi(s)$. This equation may be written in the form 

(4) $(1 - a)k = a(p - k)\int_y^\infty \phi(s)\,ds + aq\phi(y).$

If $\phi'(y) \le 0$, it is true that there is at most one root. 
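
Whether equation (4) has one root or several can be explored numerically for a given density. The following sketch scans a grid for sign changes and refines each by bisection; it is only an illustration, and the function name, the grid parameters, and the exponential example are assumptions rather than anything taken from the paper.

```python
import math

def red_tape_roots(phi, tail, a, k, p, q, y_max=20.0, steps=2000):
    """Roots in [0, y_max] of a*(p-k)*tail(y) + a*q*phi(y) - (1-a)*k = 0, equation (4)."""
    def gap(y):
        return a * (p - k) * tail(y) + a * q * phi(y) - (1.0 - a) * k

    roots = []
    h = y_max / steps
    for i in range(steps):
        lo, hi = i * h, (i + 1) * h
        if gap(lo) == 0.0:
            roots.append(lo)
            continue
        if gap(lo) * gap(hi) < 0.0:          # a sign change brackets a root
            for _ in range(80):
                mid = 0.5 * (lo + hi)
                if gap(lo) * gap(mid) <= 0.0:
                    hi = mid
                else:
                    lo = mid
            roots.append(0.5 * (lo + hi))
    return roots
```

For the exponential density phi(y) = b exp(-b y), which is decreasing, the right-hand side of (4) is a multiple of exp(-b y) and the single root is (1/b) log(a(p - k + qb) / ((1 - a)k)) whenever that number is positive. A grid scan can miss roots that lie closer together than the grid step, so a count from this sketch is only suggestive.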

If we assume that this equation has a unique root, the proof is almost exactly 
as before. There is, however, a more general result where the optimal policy is 
that of constant stock level which we shall now discuss. 

Let us prove 

Theorem 5. Under the above assumptions upon a, k, p, q and $\phi(s)$, and the 
additional assumption that the last minimum of 

(5) $\psi(y) = ky + a\left[ \int_y^\infty [p(s - y) + q]\phi(s)\,ds - k\int_0^y (y - s)\phi(s)\,ds \right]$

is the absolute minimum in $0 \le y \le \infty$, the optimal policy in (1) is given by the 
rule 

(6) (a) $y = \bar{x}$, for $0 \le x \le \bar{x}$, 
    (b) $y = x$, for $x \ge \bar{x}$, 

where $\bar{x}$ is the value of y where the absolute minimum is attained. 

Proof: Let $\bar{x}$ be the value of y which yields the last minimum, and the 
absolute minimum in the interval $[0, \infty]$, of the function $\psi(y)$ above. Then, 
precisely as in the case where $q = 0$, we have $f(x) = f(0) - kx$ in $0 \le x \le \bar{x}$, and 
$f(0)$ is determined by substituting this result in (1), in the range $0 \le x \le \bar{x}$. 
In the interval $[\bar{x}, \infty]$, $f(x)$ is the bracketed term in (1) for $y = x$. 

The proof that $f(x)$ actually satisfies the equation now continues in exactly 
the same way as in the case where $q = 0$. 


11 . Particular Cases 


Some particular cases where the above conditions are satisfied are 

(1) (a) $\phi(x) = e^{-(x - a)^2} \Big/ \int_0^\infty e^{-(s - a)^2}\,ds,$

    (b) $\phi(x) = be^{-bx}.$
12. The Form of the General Solution 

Let $f(x)$ be the solution of (10.1), which is to say 

$f(x) + kx = \min_{y \ge x} F(y),$

where 

$F(y) = ky + a\left[ p\int_y^\infty (s - y)\phi(s)\,ds + (f(0) + q)\int_y^\infty \phi(s)\,ds + \int_0^y f(y - s)\phi(s)\,ds \right].$

Let $F(y)$ have graph 4. 

[Graph 4] 

Then, the optimal policy has the following form 

(4) (a) $y = x_1, \quad 0 \le x \le x_1,$

    (b) $y = x, \quad x_1 \le x \le x_2,$

    (c) $y = x_3, \quad x_2 < x < x_3,$

    (d) $y = x, \quad x > x_3.$


Part III More Complicated Processes 


13. Unbounded Process—One Period Time Lag 

Let us now state a result for a process of more complicated type. The proof is 
fairly straightforward, but quite detailed, and depends upon the method of 
successive approximations. 

Theorem. Consider the equation 

(1) $f(x) = \min_{z \ge 0} \left[ kz + a\left\{ \int_x^\infty p(s - x)\phi(s)\,ds + f(z)\int_x^\infty \phi(s)\,ds + \int_0^x f(x - s + z)\phi(s)\,ds \right\} \right],$

under the previous assumptions. 
The optimal policy is given by the rule 

(2) $z = z(x)$ for $0 \le x \le \bar{x}$, 
    $z = 0$ for $x \ge \bar{x}$, 

where $z(x) \ge 0$ and $z(\bar{x}) = 0$. The function $z(x)$ is monotone decreasing in x. 


14. Convex Cost Function—Unbounded Process 


As another illustration of the type of result which can be obtained, let us 
consider the case where the cost of ordering is a convex function of the amount 
ordered. Again applying the method of successive approximations, we can prove 

Theorem. Consider the equation 

(1) $f(x) = \min_{y \ge x} \left[ g(y - x) + a\left\{ \int_y^\infty p(s - y)\phi(s)\,ds + f(0)\int_y^\infty \phi(s)\,ds + \int_0^y f(y - s)\phi(s)\,ds \right\} \right],$

where $g(y)$ is a convex, monotone increasing function of y. 
There is a function $y(x)$ and a number $\bar{x}$ with the properties 

(2) (a) $y(x) \ge x$, $y(x)$ is monotone decreasing, 

    (b) $y(x) > x$ for $x < \bar{x}$, $y(x) = x$ for $x \ge \bar{x}$, 

    (c) $\bar{x} > 0$ if $ap > g'(0)$. 

This function $y(x)$ determines the optimal policy for (1). 
DYNAMIC INVENTORY POLICY WITH VARYING 
STOCHASTIC DEMANDS*† 


SAMUEL KARLIN 
Stanford University 

A dynamic inventory model is formulated in which the demand distributions 
may change from period to period. The optimal policy at each stage is 
characterized by a single critical number which also could vary in successive 
periods. The dependence of the critical numbers as a function of stochastic 
ordering amongst distributions is developed under various conditions. Most 
of the studies are conducted under the assumption of linear purchasing cost. 
In section 3 the possibility of convex purchasing cost is allowed. 


1. Introduction 

In this paper we shall consider an extended version of the classical 
Arrow-Harris-Marschak dynamic inventory model, with emphasis on the varying 
nature of the demand distributions. For a detailed discussion of this model see 
Chapters 8, 9, and 10 of [2]. A historical account of the general inventory problem 
may be found in Chapters 1 and 2 of [2]. 

Throughout this paper we restrict our attention to the case of a single 
commodity. A sequence of ordering decisions is to be made at the beginning of a 
number of periods of equal duration. These decisions result in the building up 
of inventories. On the other hand, stock is depleted by consumption (demand) 
in each period. The demand in each period is assumed to be an observation of 
a random variable with a known distribution function. These random variables 
are postulated to be independent, but not necessarily identically distributed. We 
also assume that all distribution functions possess continuous densities, and 
that such distributions as occur belong to non-negative random variables. (The 
assumption of continuous densities is not an essential restriction, but helps to 
avoid a tedious consideration of cases. Most of the results developed in this 
paper remain valid for discrete distributions and to a large extent for the general 
distribution.) 

Several costs are incurred during each period. In general we recognize three 
types of costs: a purchase or ordering cost $c(z)$, where z is the amount purchased; 
a holding cost $h(\cdot)$, associated with the cumulative excess of supply over 
demand, which is charged at the end of the period; and a shortage or penalty cost 
$p(\cdot)$, associated with the cumulative excess of demand over supply, which is 
also charged at the end of the period.¹ We shall also consider a revenue factor, 
which, of course, should be regarded as a negative cost, and which we shall 
assume to be linear. Finally, we assume that the cost functions are sufficiently 
smooth so that all integrals involving these functions exist and that subsequent 
operations on these integrals are fully justified. 

* Received November 1959. 

† I would like to express my thanks to D. L. Iglehart for his help in writing this paper. 
I also acknowledge partial support from the Office of Naval Research. 

Excess demand in each period is usually handled in one of two ways: either 
it is considered lost sales, in which case the stock level at the start of the next 
period has the value zero (the non-backlog model), or it is backlogged and 
satisfied by subsequent deliveries of stock ordered at the earliest opportunity 
(the backlog model). We shall deal with both models. For a discussion of the 
nature of the optimal policy in both cases, see [2], Chapter 10. 

Most studies of dynamic inventory models are concerned with determining 
the characteristics of the optimal policy, i.e., the policy that minimizes the total 
expected costs, where costs in the future periods are properly discounted. If the 
cost functions of the model are suitably convex (this includes the case of linear 
costs and others), and if the demands that arise in successive periods are 
independent and identically distributed random variables with known distribution 
functions, then it is clear (see [2], Chapter 9; and [10]) that the optimal policy 
in each period is characterized by a single critical number, or at most two such 
numbers, in the following manner: There exist two values s and $S \ge s$ such 
that in the event the stock level (including stock on hand and stock ordered) 
falls below s, the ordering rule calls for replenishing stock to the level S; when 
the stock level exceeds s, no ordering is done. There is usually a delay (lag) in 
the delivery of ordered goods. 
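
The two-number ordering rule just described can be written out directly. This is a minimal sketch for illustration; the function name is hypothetical, and the treatment of a stock level exactly equal to s (no order) is an assumption, since the text only covers the cases below and above s.

```python
def order_quantity(stock, s, S):
    """Amount to order under an (s, S) rule: raise stock to S when it is below s."""
    if s > S:
        raise ValueError("the rule requires s <= S")
    if stock < s:
        return S - stock
    return 0.0
```

Setting s equal to S gives the single critical number policy: any stock level below the critical number is raised to it, and nothing is ordered otherwise.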

If the purchase cost is a linear function of the quantity ordered, the optimal 
policy is characterized by a single critical number: i.e., we have $s = S$ in the 
policy described above. Moreover, when the demand has the same density $\varphi(\cdot)$ 
in consecutive periods and delivery is instantaneous, the critical value of the 
optimal policy can be calculated as the unique positive solution of 

(A) $c + \int_0^x [h'(x - \xi) - ac]\varphi(\xi)\,d\xi - \int_x^\infty [p'(\xi - x) + r]\varphi(\xi)\,d\xi = 0$

in the non-backlog case, or of 

(B) $c(1 - a) + \int_0^x h'(x - \xi)\varphi(\xi)\,d\xi - \int_x^\infty [p'(\xi - x) + r]\varphi(\xi)\,d\xi = 0$

in the backlog case. Here, r is the marginal revenue cost and a denotes the 
effective discount rate. The precise conditions under which these assertions hold 
are given in [2], Chapter 9. The first result of this kind was obtained by Bellman 
[3] (see also [4] and [6]). If there are lags in delivery, then there exist 
corresponding equations from which we may calculate the critical number (see [2], Chapter 
10). 

The validity of formulas (A) and (B) is based on two factors. The first is 
the assumption that the purchase cost is linear. (In contrast, when the ordering 
cost is composed of a set-up charge in addition to a linear cost proportional to 
the quantity of stock ordered, the optimal policy is an (s, S) policy [12] and 
there is no known way of calculating the critical numbers.) The second is the 
assumption that demand is stationary over time (identically distributed from 
period to period). From a practical point of view, it is important to free ourselves 
from this restriction. 

1 Other ways of charging costs can be dealt with by similar methods. 

In this paper we shall assume that the demand constitutes a sequence of 
independent random variables over successive periods which are not, in general, 
identically distributed. We first prove, under the assumption of linear purchase 
cost (the other cost functions are general convex functions), that the optimal 
policy again possesses a simple form, i.e., that in each period whether or not to 
place an order is determined by comparing the stock level with a single critical 
number. However, this critical number may vary in successive periods, and 
ordinarily it cannot be explicitly evaluated by solving for the root of a single 
transcendental equation such as (A), or by any other known means. (In this 
connection, we note that an explicit algorithm for calculating the critical 
numbers is available [8] in the special but important case in which the sequence of 
demand densities varies cyclically.) 


The main objective of this paper is to develop qualitative results describing 
the variation of the critical number over time as a function of the demand 
densities in all future periods. Since we are primarily concerned with 
investigating the functional relationship between the optimal policy and the demand 
distribution, we shall assume, in order to simplify the exposition, that the cost 
functions are the same in all periods. Moreover, unless there is an explicit 
statement to the contrary, we shall assume that the purchase cost $c(z) = c \cdot z$ is linear 
and that $h(\cdot)$ and $p(\cdot)$ are convex, increasing, continuous, and vanishing at the 
origin. Actually, most of our results remain valid even if the cost functions 
change in successive time periods, provided we continue to assume the linearity 
of the purchase cost and the convexity properties of the other cost functions. 

> ^ 2 , £> 3 , * * * represent the demand densities in periods 1 , 2 , 3 , • • • , and 
let 5 (^ 1 , *> 2 , <p* , • • • ) denote the optimal critical number in the’first period. 
(I hat the optimal policy is characterized by a single number was noted above.) 
From its very definition the optimal critical number in the second period is 

* (w ’ ** ’ ; ‘' k 80 fo ^- In particular, x(<p, *,*,.••) is the optimal critical 
number when the demand density <p is the same in each period. This last number 
can be calculated explicitly from equations (A) and (B). 

Ideally, it would be desirable to make comparisons between
$\bar{x}(\varphi_1, \varphi_2, \ldots)$ and $\bar{x}(\psi_1, \psi_2, \ldots)$,
where $\varphi_1, \varphi_2, \ldots$ and $\psi_1, \psi_2, \ldots$ represent two
different sequences of demand densities. Usually no such comparison can be made.
However, if the respective distributions are stochastically ordered (defined
below) we can establish certain relationships between the critical numbers.

We say that the density $\varphi$ is stochastically smaller than the density
$\psi$ (written $\varphi \subset \psi$) if $\Phi(x) \ge \Psi(x)$ for all
$x \ge 0$, where

$$ \Phi(x) = \int_0^x \varphi(\xi)\,d\xi \quad \text{and} \quad \Psi(x) = \int_0^x \psi(\xi)\,d\xi. $$

In particular, this means that demands based on the density $\varphi(\xi)$ have
a larger probability of taking on smaller values than those based on the density
$\psi(\xi)$. Not all distributions can be ordered in this manner, but there are
important cases in which this ordering relationship applies.

Some specific examples of stochastic ordering are as follows: 

(i) One distribution is a translate of the other: i.e., $\varphi \subset \psi$
where $\psi(\xi) = \varphi(\xi - a)$ for $a > 0$.

(ii) If $\Phi(x) = F(x)$ and $\Psi(x) = F^r(x)$, where $r > 1$, then
$\varphi \subset \psi$.

(iii) If $\Phi(x)$ is the distribution function of the positive random variable
$X$ and $\Psi(x)$ is the distribution function of $X + Y$, where $Y$ is also a
positive random variable, then $\varphi \subset \psi$. This example includes
convolutions of positive random variables as a special case.
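
The ordering in examples (i) and (ii) can be checked numerically on a grid. The sketch below is only an illustration: the exponential distribution function `F`, the grid, and the helper name `is_stochastically_smaller` are assumptions introduced here and do not appear in the paper.

```python
import math

def is_stochastically_smaller(Phi, Psi, grid, tol=1e-12):
    """Return True if Phi(x) >= Psi(x) at every grid point.

    On the grid this is the defining condition for the density of Phi
    to be stochastically smaller than the density of Psi.
    """
    return all(Phi(x) >= Psi(x) - tol for x in grid)

def F(x):
    """Illustrative distribution function: exponential with unit rate."""
    return 1.0 - math.exp(-x) if x > 0 else 0.0

grid = [0.05 * k for k in range(401)]  # 0, 0.05, ..., 20

# Example (i): Psi is a translate of Phi by a > 0.
a = 1.5
translate_ok = is_stochastically_smaller(F, lambda x: F(x - a), grid)

# Example (ii): Psi = F**r with r > 1.
r = 2.0
power_ok = is_stochastically_smaller(F, lambda x: F(x) ** r, grid)

# The ordering is not symmetric: reversing the roles fails.
reverse_ok = is_stochastically_smaller(lambda x: F(x) ** r, F, grid)

print(translate_ok, power_ok, reverse_ok)
```

A grid check of this kind can only refute the ordering or make it plausible; it does not prove it for all $x \ge 0$.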

One of the key results of this paper asserts that if
$\varphi_i \subset \psi_i$ for all $i$, then

$$ \bar{x}(\varphi_1, \varphi_2, \ldots) \le \bar{x}(\psi_1, \psi_2, \ldots). $$

The result is correct in all situations, regardless of whether the structure of the 
model permits backlogging and lags in delivery. Thus, we can translate the 
stochastic order relationships satisfied by the respective demand distributions 
in successive periods into direct order relationships between the successive 
critical values of the optimal policy (Corollaries 3 and 4). 

A corresponding result is achieved in the case of a convex purchase cost. 

In Section 2, we prove that if the purchase cost is linear and the demand 
distribution varies arbitrarily from period to period, the optimal policy in each 
period is determined by a single critical number. The method of proof follows 
closely the method used in [2], Chapter 9 (see also [4]). What is more important 
here is the proof of the auxiliary statement that the minimum expected
discounted cost function is a convex function of stock level. With the
characterization of the optimal policy known, we then prove the main theorem
pertaining to comparisons of the successive critical numbers, and with the aid
of this basic result we develop a series of corollaries that describe
$\bar{x}(\varphi_1, \varphi_2, \ldots)$ as a function of the time period.

The corresponding theorems are proved in Section 3 for a convex purchase 
cost. In Section 4, the theory is developed for a model in which backlogging and 
lags in delivery are permitted. In Section 5 we discuss the variation of the 
optimal policy when the demand density has an unknown parameter that must 
be estimated by statistical methods. 

2. Optimal Policy for Linear Purchase Cost 

In this section we prove several theorems that relate the critical number of 
the optimal policy to the nature of future demand distributions. We assume in 
what follows that the demand distributions are suitably stochastically ordered. 

We first characterize the form of the optimal policy under the following 
conditions: 

(i) The purchase cost $c(z)$ is linear [$c(z) = c \cdot z$].

(ii) The holding and shortage costs $h(\cdot)$ and $p(\cdot)$ are each
continuous, convex, increasing functions that vanish at the origin.

(iii) There is a revenue term proportional to the amount sold, with unit
revenue factor $r$.

(iv) There are no time lags in delivery.

(v) Demands in successive periods are described by the sequence of underlying
densities $\varphi_1(\xi), \varphi_2(\xi), \ldots$, each of which is strictly
positive and continuous.

(vi) There is no backlogging of excess demand.

The basic relation analyzed in deriving the optimal ordering rule is the
associated functional equation, which expresses the symmetries and renewal
properties of the dynamic inventory process. Its explicit form is indicated in
equation (2) below. For convenience of notation, we set

$$ L(y; \varphi) = \int_0^y [h(y - \xi) - r\xi]\,\varphi(\xi)\,d\xi + \int_y^\infty [p(\xi - y) - ry]\,\varphi(\xi)\,d\xi. \tag{1} $$

This represents the expected combined revenue, shortage costs, and holding
costs for one period when $y$ units of stock are available. Note that we
explicitly exhibit the dependence of $L$ on the demand density $\varphi$, since
the density will normally vary from period to period.
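
To make the one-period cost concrete, the following sketch approximates $L(y; \varphi)$ by a midpoint rule. The linear costs $h(u) = h_0 u$ and $p(u) = p_0 u$, the unit-rate exponential density, the truncation point `upper`, and all function names are assumptions chosen for illustration; the paper itself only requires $h$ and $p$ to be convex and increasing.

```python
import math

def expected_one_period_cost(y, density, h, p, r, upper, steps=20000):
    """Midpoint-rule approximation of L(y; phi).

    For demand xi <= y the stock left over is y - xi and xi units are sold;
    for demand xi > y the shortage is xi - y and y units are sold.
    The integral is truncated at `upper`, which must cover the bulk of the
    demand distribution.
    """
    width = upper / steps
    total = 0.0
    for k in range(steps):
        xi = (k + 0.5) * width
        if xi <= y:
            integrand = h(y - xi) - r * xi
        else:
            integrand = p(xi - y) - r * y
        total += integrand * density(xi) * width
    return total

# Hypothetical cost data for illustration only.
h0, p0, r = 1.0, 3.0, 2.0
density = lambda xi: math.exp(-xi)          # exponential demand, mean 1
hold = lambda u: h0 * u
short = lambda u: p0 * u

L_at_0 = expected_one_period_cost(0.0, density, hold, short, r, upper=40.0)
L_at_1 = expected_one_period_cost(1.0, density, hold, short, r, upper=40.0)

def closed_form(y):
    """Closed form of L(y) for the linear/exponential case assumed above."""
    e = math.exp(-y)
    return h0 * (y - 1.0 + e) - r * (1.0 - e) + p0 * e

print(L_at_0, closed_form(0.0))
print(L_at_1, closed_form(1.0))
```

With no stock on hand, every unit of demand is short, so $L(0; \varphi)$ reduces to $p_0$ times the mean demand.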

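Let $f(x; \varphi_1, \varphi_2, \ldots)$ denote the discounted expected cost
that will be incurred during an infinite sequence of time periods if $x$ is the
initial stock level, $\varphi_i$ is the demand density for period $i$
($i = 1, 2, \ldots$), and an optimal ordering rule is used at each purchasing
opportunity. If we assume that excess demand cannot be backlogged, we obtain,
in the usual fashion,

$$ f(x; \varphi_1, \varphi_2, \ldots) = \min_{y \ge x} \Big\{ c(y - x) + L(y; \varphi_1) + \alpha \Big[ f(0; \varphi_2, \varphi_3, \ldots) \int_y^\infty \varphi_1(\xi)\,d\xi + \int_0^y f(y - \xi; \varphi_2, \varphi_3, \ldots)\,\varphi_1(\xi)\,d\xi \Big] \Big\}, \tag{2} $$

where $\alpha$ denotes the discount factor and $0 < \alpha < 1$. We shall need
to deal with the related function

$$ G(y; \varphi_1, \varphi_2, \ldots) = cy + L(y; \varphi_1) + \alpha \Big[ f(0; \varphi_2, \varphi_3, \ldots) \int_y^\infty \varphi_1(\xi)\,d\xi + \int_0^y f(y - \xi; \varphi_2, \varphi_3, \ldots)\,\varphi_1(\xi)\,d\xi \Big]. \tag{3} $$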

In determining the optimal policy, we assume that the following assumptions
are satisfied.

Assumption I: $L'(0; \varphi) + c < 0$.

Assumption II: $h'(0) + p'(0) + r - \alpha c > 0$.

Both assumptions will be satisfied if, for example, the marginal revenue is
larger than the marginal ordering cost. Assumption I will also be satisfied if
the expected marginal shortage cost exceeds the marginal ordering cost. Both
assumptions are weak restrictions; they will be satisfied in nearly every sound
enterprise. If Assumption II is not satisfied, the optimal policy is never to
order (i.e., to fill demands by priority shipments as they arise, thereby
suffering a corresponding penalty charge). Under conditions (i)-(vi) and
Assumptions I and II, we shall characterize the optimal ordering rule for the
infinite-stage model. The method of proof is a familiar one (see also [4] and
[2], Chapter 9), involving induction on the number of periods in the model.

We now establish two basic properties of the optimal policy which will be 
referred to frequently. 

Theorem 1. If conditions (i)-(vi) and Assumptions I and II are satisfied,
then (a) the optimal ordering rule is characterized by a single critical number
$\bar{x}(\varphi_1, \varphi_2, \ldots)$, and (b)
$f(x; \varphi_1, \varphi_2, \ldots)$ is a convex function of $x$. The critical
number completely determines the solution in the following manner: if
$x < \bar{x}(\varphi_1, \varphi_2, \ldots)$, the optimal policy is to order up
to $\bar{x}(\varphi_1, \varphi_2, \ldots)$; if
$x > \bar{x}(\varphi_1, \varphi_2, \ldots)$, the optimal policy is not to order.

Proof: The proof proceeds by induction on the total number of periods in
the inventory program. Specifically, we shall truncate the model to $n$ periods
and subsequently let $n \to \infty$. For the one-period inventory model, we
obtain

$$ f(x; \varphi_1) = \min_{y \ge x} \{ c(y - x) + L(y; \varphi_1) \} \tag{4} $$

and

$$ G(y; \varphi_1) = cy + L(y; \varphi_1). \tag{5} $$

Let $\bar{x}(\varphi_1)$ be defined as the smallest value of $y$ for which

$$ G(\bar{x}; \varphi_1) = \min_{y \ge 0} G(y; \varphi_1). $$

By Assumption I, we have $G'(0; \varphi_1) < 0$, and since
$G(y; \varphi_1) \to \infty$ as $y \to \infty$, we infer that
$0 < \bar{x}(\varphi_1) < \infty$. Clearly, the value of $y$ that minimizes
$G(y; \varphi_1)$ also minimizes $c(y - x) + L(y; \varphi_1)$ with respect to
$y$. If we can show that $G(y; \varphi_1)$ is convex, then $\bar{x}(\varphi_1)$
will be the smallest root of the equation

$$ G'(y; \varphi_1) = 0. \tag{6} $$

Differentiating $G(y; \varphi_1)$ twice with respect to $y$ yields

$$ G'(y; \varphi_1) = c + \int_0^y h'(y - \xi)\,\varphi_1(\xi)\,d\xi - \int_y^\infty [p'(\xi - y) + r]\,\varphi_1(\xi)\,d\xi, \tag{7} $$

$$ G''(y; \varphi_1) = \int_0^y h''(y - \xi)\,\varphi_1(\xi)\,d\xi + \int_y^\infty p''(\xi - y)\,\varphi_1(\xi)\,d\xi + [h'(0) + p'(0) + r]\,\varphi_1(y). \tag{8} $$

Since $h(\cdot)$ and $p(\cdot)$ are each continuous, convex, increasing
functions [condition (ii)] and $r$ is non-negative, obviously
$G(y; \varphi_1)$ is convex and $\bar{x}(\varphi_1)$ is the smallest root of
(6). In view of (4), it follows that where $x < \bar{x}(\varphi_1)$ the optimal
policy calls for ordering to the level $\bar{x}(\varphi_1)$, and where
$x > \bar{x}(\varphi_1)$ the optimal policy calls for no ordering. Thus we have





$$ f(x; \varphi_1) = \begin{cases} -cx + G(\bar{x}(\varphi_1); \varphi_1), & x < \bar{x}(\varphi_1), \\ -cx + G(x; \varphi_1), & x > \bar{x}(\varphi_1), \end{cases} \tag{9} $$

and from differentiation with respect to $x$ we obtain

$$ f'(x; \varphi_1) = \begin{cases} -c, & x < \bar{x}(\varphi_1), \\ -c + G'(x; \varphi_1), & x > \bar{x}(\varphi_1). \end{cases} \tag{10} $$

Note that $f'(x; \varphi_1)$ is a continuous function of $x$, since
$G'[\bar{x}(\varphi_1); \varphi_1] = 0$, and that $f'(x; \varphi_1)$ is a
non-decreasing function of $x$, since $G$ is convex. Moreover, the second
derivative of $f(x; \varphi_1)$ exists everywhere except possibly at the point
$\bar{x}(\varphi_1)$, at which, however, left- and right-hand bounded
derivatives exist. Thus

$$ f''(x; \varphi_1) \ge 0 \quad \text{except at } x = \bar{x}(\varphi_1), $$

which is enough to show that $f(x; \varphi_1)$ is convex. These considerations
prove the theorem for the one-period inventory model.
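
For the one-period model the critical number is the smallest root of $G'(y; \varphi_1) = 0$. As a minimal sketch, assume linear costs $h(u) = h_0 u$ and $p(u) = p_0 u$ and exponential demand with rate `lam`; these choices, the numerical values, and the helper names are illustrative assumptions, not part of the paper. Under them, (7) reduces to $G'(y) = c + h_0 \Phi_1(y) - (p_0 + r)[1 - \Phi_1(y)]$, so the root can be found by bisection and compared with the closed-form quantile.

```python
import math

def smallest_root(dG, lo=0.0, hi=1.0, tol=1e-10):
    """Bisection for the root of a non-decreasing function dG with dG(lo) < 0."""
    while dG(hi) < 0.0:          # expand the bracket until the sign changes
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if dG(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical cost and demand data.
c, h0, p0, r, lam = 1.0, 0.5, 3.0, 2.0, 0.25

def Phi(y):
    """Distribution function of exponential demand with rate lam."""
    return 1.0 - math.exp(-lam * y)

def dG(y):
    """G'(y) from equation (7), specialised to linear h and p."""
    return c + h0 * Phi(y) - (p0 + r) * (1.0 - Phi(y))

x_bar = smallest_root(dG)

# Setting dG(y) = 0 gives Phi(y) = (p0 + r - c) / (h0 + p0 + r).
ratio = (p0 + r - c) / (h0 + p0 + r)
x_bar_closed = -math.log(1.0 - ratio) / lam

print(x_bar, x_bar_closed)
```

Assumption I corresponds here to `dG(0) < 0`, i.e. $p_0 + r > c$, which is what guarantees a positive critical number.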

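Assuming now that the theorem has been proved for an $(n - 1)$-period
model, we shall show that it holds for an $n$-period model. For the $n$-period
model the discounted expected cost following an optimal policy is

$$ f(x; \varphi_1, \ldots, \varphi_n) = \min_{y \ge x} \Big\{ c(y - x) + L(y; \varphi_1) + \alpha \Big[ f(0; \varphi_2, \ldots, \varphi_n) \int_y^\infty \varphi_1(\xi)\,d\xi + \int_0^y f(y - \xi; \varphi_2, \ldots, \varphi_n)\,\varphi_1(\xi)\,d\xi \Big] \Big\}. $$

Let

$$ G(y; \varphi_1, \ldots, \varphi_n) = cy + L(y; \varphi_1) + \alpha \Big[ f(0; \varphi_2, \varphi_3, \ldots, \varphi_n) \int_y^\infty \varphi_1(\xi)\,d\xi + \int_0^y f(y - \xi; \varphi_2, \varphi_3, \ldots, \varphi_n)\,\varphi_1(\xi)\,d\xi \Big]. \tag{13} $$

Differentiating $G(y; \varphi_1, \ldots, \varphi_n)$ twice with respect to $y$
gives

$$ G''(y; \varphi_1, \varphi_2, \ldots, \varphi_n) = [h'(0) + p'(0) + r + \alpha f'(0; \varphi_2, \varphi_3, \ldots, \varphi_n)]\,\varphi_1(y) + \int_0^y h''(y - \xi)\,\varphi_1(\xi)\,d\xi + \int_y^\infty p''(\xi - y)\,\varphi_1(\xi)\,d\xi + \alpha \int_0^y f''(y - \xi; \varphi_2, \varphi_3, \ldots, \varphi_n)\,\varphi_1(\xi)\,d\xi. \tag{14} $$

It follows from our induction assumption that

$$ f'(0; \varphi_2, \varphi_3, \ldots, \varphi_n) = -c. \tag{15} $$

In view of Assumption II, it is now clear that
$G(y; \varphi_1, \varphi_2, \ldots, \varphi_n)$ is convex. The argument
hereafter is identical to the one used for the one-period model. It thus
follows that the theorem holds for the $n$-period model.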

By applying standard limiting arguments (see [7] for the details) we can
show that $f(x; \varphi_1, \varphi_2, \ldots, \varphi_n)$ converges to
$f(x; \varphi_1, \varphi_2, \ldots)$, and similarly that the critical number
$\bar{x}(\varphi_1, \varphi_2, \ldots, \varphi_n)$ of the truncated model
converges to $\bar{x}(\varphi_1, \varphi_2, \ldots)$, the critical number of
the full dynamic model. The proof of the theorem is complete.
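
The induction in the proof can be mimicked numerically by backward recursion on a truncated model. The sketch below is a discrete analogue under stated assumptions: demand takes integer values with a given probability mass function in each period, stock is restricted to the grid $0, 1, \ldots,$ `max_stock`, costs are linear ($h(u) = h_0 u$, $p(u) = p_0 u$), and the cost after the last period is zero. The function name `critical_numbers` and all numerical values are hypothetical.

```python
def critical_numbers(pmfs, c, h0, p0, r, alpha, max_stock):
    """Backward induction for an n-period model without backlogging.

    pmfs[i][d] is the probability that demand in period i + 1 equals d.
    Returns the critical number (smallest minimiser of G) for each period.
    """
    n = len(pmfs)
    f_next = [0.0] * (max_stock + 1)     # cost-to-go after the last period
    result = [0] * n
    for i in range(n - 1, -1, -1):
        pmf = pmfs[i]
        G = []
        for y in range(max_stock + 1):
            total = c * y
            for d, prob in enumerate(pmf):
                left = max(y - d, 0)     # stock carried into the next period
                short = max(d - y, 0)    # unfilled demand (lost)
                sold = min(y, d)
                total += prob * (h0 * left + p0 * short - r * sold
                                 + alpha * f_next[left])
            G.append(total)
        # min() returns the first index attaining the minimum,
        # i.e. the smallest minimiser of G.
        result[i] = min(range(max_stock + 1), key=lambda y: G[y])
        # f(x) = -c*x + min over y >= x of G(y), built from the top down.
        f = [0.0] * (max_stock + 1)
        running = float("inf")
        for x in range(max_stock, -1, -1):
            running = min(running, G[x])
            f[x] = -c * x + running
        f_next = f
    return result

uniform = [0.1] * 10                     # demand uniform on 0..9
shifted = [0.0, 0.0] + [0.1] * 10        # the same demand translated by 2

one_period_small = critical_numbers([uniform], 0.9, 1.0, 2.0, 0.1, 0.9, 20)
one_period_large = critical_numbers([shifted], 0.9, 1.0, 2.0, 0.1, 0.9, 20)
three_periods = critical_numbers([uniform, shifted, shifted],
                                 0.9, 1.0, 2.0, 0.1, 0.9, 20)
print(one_period_small, one_period_large, three_periods)
```

The list returned for a multi-period call gives one critical number per period, so the variation of the critical number over time can be read off directly for any chosen demand sequence.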

If we alter condition (vi) to allow backlogging of excess demand, the
associated functional equation takes the form

$$ f(x; \varphi_1, \varphi_2, \ldots, \varphi_n) = \min_{y \ge x} \Big\{ c(y - x) + L(y; \varphi_1) + \alpha \int_0^\infty f(y - \xi; \varphi_2, \ldots, \varphi_n)\,\varphi_1(\xi)\,d\xi \Big\}. \tag{16} $$

In this case we can prove Theorem 1 without requiring that Assumption II be
satisfied. The argument here is identical to that used in the nonbacklog case
and will not be repeated.

Assuming now that the successive demands are suitably stochastically ordered 
(see Section 1), we proceed to derive several qualitative results that describe 
the variation of the critical number. We assume again that conditions (i)-(vi) 
and Assumptions I and II are satisfied. 

Theorem 2. If we are given two sequences of demand densities
$\varphi_1, \varphi_2, \ldots$ and $\psi_1, \psi_2, \ldots$, and if
$\varphi_i \subset \psi_i$ for $i = 1, 2, \ldots$, then

(a) $\bar{x}(\varphi_1, \varphi_2, \ldots) \le \bar{x}(\psi_1, \psi_2, \ldots)$

and

(b) $f'(x; \varphi_1, \varphi_2, \ldots) \ge f'(x; \psi_1, \psi_2, \ldots)$ for all $x$.

Proof: The proof proceeds by induction on the number of periods; we first
prove the results corresponding to (a) and (b) for an $n$-period model and then
let $n \to \infty$. For the one-period model we consider demand densities
$\varphi_1$ and $\psi_1$, where $\varphi_1 \subset \psi_1$. Integrating (7) by
parts, we obtain for the first integral

$$ \int_0^y h'(y - \xi)\,\varphi_1(\xi)\,d\xi = h'(0)\,\Phi_1(y) + \int_0^y h''(y - \xi)\,\Phi_1(\xi)\,d\xi, \tag{17} $$

where

$$ \Phi_1(\xi) = \int_0^\xi \varphi_1(\eta)\,d\eta, $$

and for the second integral

$$ -\int_y^\infty [p'(\xi - y) + r]\,\varphi_1(\xi)\,d\xi = [p'(0) + r]\,\Phi_1(y) - p'(0) - r - \int_y^\infty p''(\xi - y)\,[1 - \Phi_1(\xi)]\,d\xi. \tag{18} $$

Combining (17) and (18), we may write (7) as

$$ G'(y; \varphi_1) = c + [h'(0) + p'(0) + r]\,\Phi_1(y) - p'(0) - r + \int_0^y h''(y - \xi)\,\Phi_1(\xi)\,d\xi - \int_y^\infty p''(\xi - y)\,[1 - \Phi_1(\xi)]\,d\xi. \tag{19} $$

A similar expression is obtained for $G'(y; \psi_1)$, where

$$ \Psi_1(\xi) = \int_0^\xi \psi_1(\eta)\,d\eta. $$

Since $\varphi_1 \subset \psi_1$, it follows by definition that
$\Phi_1(\xi) \ge \Psi_1(\xi)$ for all $\xi \ge 0$; hence

$$ G'(y; \varphi_1) \ge G'(y; \psi_1) \quad \text{for all } y \ge 0. \tag{20} $$

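But Theorem 1 tells us that $\bar{x}(\varphi_1)$ is the smallest root of the
equation

$$ G'(y; \varphi_1) = 0. $$

Hence it follows immediately from (20) that

$$ \bar{x}(\varphi_1) \le \bar{x}(\psi_1). \tag{21} $$

It is clear from Theorem 1 that the optimal policies for $\varphi_1$ and
$\psi_1$ are each characterized by a single critical number. Consequently, if
we compare (20) and (21) with (10) and the corresponding equation for $\psi_1$,

$$ f'(x; \psi_1) = \begin{cases} -c, & x < \bar{x}(\psi_1), \\ -c + G'(x; \psi_1), & x > \bar{x}(\psi_1), \end{cases} \tag{22} $$

it is clear that

$$ f'(x; \varphi_1) \ge f'(x; \psi_1) \quad \text{for all } x \ge 0. \tag{23} $$

Thus we have proved the theorem for the one-period model.

Assume now that we have proved

$$ \bar{x}(\varphi_1, \varphi_2, \ldots, \varphi_{n-1}) \le \bar{x}(\psi_1, \psi_2, \ldots, \psi_{n-1}) $$

and

$$ f'(x; \varphi_1, \varphi_2, \ldots, \varphi_{n-1}) \ge f'(x; \psi_1, \psi_2, \ldots, \psi_{n-1}) \quad \text{for all } x \ge 0, $$

for any $n - 1$ pairs of demand densities $\varphi_i$ and $\psi_i$ satisfying
$\varphi_i \subset \psi_i$. Differentiating (13) with respect to $y$ yields

$$ G'(y; \varphi_1, \varphi_2, \ldots, \varphi_n) = c + L'(y; \varphi_1) + \alpha \int_0^y f'(y - \xi; \varphi_2, \ldots, \varphi_n)\,\varphi_1(\xi)\,d\xi, \tag{24} $$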

from which, invoking our induction assumption, we obtain

$$ G'(y; \varphi_1, \varphi_2, \ldots, \varphi_n) \ge c + L'(y; \varphi_1) + \alpha \int_0^y f'(y - \xi; \psi_2, \psi_3, \ldots, \psi_n)\,\varphi_1(\xi)\,d\xi. \tag{25} $$

Integrating the last integral in (25) by parts produces

$$ \alpha \int_0^y f'(y - \xi; \psi_2, \psi_3, \ldots, \psi_n)\,\varphi_1(\xi)\,d\xi = \alpha f'(0; \psi_2, \ldots, \psi_n)\,\Phi_1(y) + \alpha \int_0^y f''(y - \xi; \psi_2, \psi_3, \ldots, \psi_n)\,\Phi_1(\xi)\,d\xi. \tag{26} $$

Combining (15), (17), (18), (25), and (26), we obtain

$$ G'(y; \varphi_1, \varphi_2, \ldots, \varphi_n) \ge c - p'(0) - r + [h'(0) + p'(0) + r - \alpha c]\,\Phi_1(y) + \int_0^y h''(y - \xi)\,\Phi_1(\xi)\,d\xi - \int_y^\infty p''(\xi - y)\,[1 - \Phi_1(\xi)]\,d\xi + \alpha \int_0^y f''(y - \xi; \psi_2, \psi_3, \ldots, \psi_n)\,\Phi_1(\xi)\,d\xi. \tag{27} $$

If in the right-hand side of (27) we replace $\Phi_1(\cdot)$ by
$\Psi_1(\cdot)$, the inequality will be strengthened. This follows from
condition (ii), Assumption II, statement (b) of Theorem 1, and the fact that
$\varphi_1 \subset \psi_1$. When we replace $\Phi_1(\xi)$ by $\Psi_1(\xi)$, the
right-hand side of (27) becomes identically equal to
$G'(y; \psi_1, \psi_2, \ldots, \psi_n)$. Therefore,

$$ G'(y; \varphi_1, \varphi_2, \ldots, \varphi_n) \ge G'(y; \psi_1, \psi_2, \ldots, \psi_n) \quad \text{for all } y \ge 0. \tag{28} $$

The same argument used for the one-period model now readily yields

$$ \bar{x}(\varphi_1, \varphi_2, \ldots, \varphi_n) \le \bar{x}(\psi_1, \psi_2, \ldots, \psi_n) \tag{29} $$

and

$$ f'(x; \varphi_1, \varphi_2, \ldots, \varphi_n) \ge f'(x; \psi_1, \psi_2, \ldots, \psi_n) \quad \text{for all } x \ge 0. \tag{30} $$

This proves the theorem for the $n$-period model. The corresponding results for
the infinite-stage model may be arrived at by the limiting procedure referred
to in the proof of Theorem 1.

If we alter condition (vi) to allow backlogging of excess demand, the
conclusions of Theorem 2 (like those of Theorem 1) remain valid regardless of
whether or not Assumption II is satisfied.

From Theorem 2 we shall now deduce several corollaries that describe the 
variation in the critical number when the demand density distribution varies 
in some definite pattern; for example, Corollary 1 states in effect that the critical 
number in the first period is a monotone-increasing function of the length of 
the inventory program. In the corollaries and lemmas to follow, we assume 
that conditions (i)-(vi) and Assumptions I and II are satisfied. 

Corollary 1. If the demand densities for the first $n + 1$ periods are given by
$\varphi_1, \varphi_2, \ldots, \varphi_n, \varphi_{n+1}$, then

$$ \bar{x}(\varphi_1, \varphi_2, \ldots, \varphi_n, 0) \le \bar{x}(\varphi_1, \varphi_2, \ldots, \varphi_n, \varphi_{n+1}). $$

The zero following $\varphi_n$ on the left-hand side of the inequality is to be
interpreted as the density of a random variable whose only possible value is 0.

Proof: Since $0 \subset \varphi_{n+1}$ for any $\varphi_{n+1}$, the corollary
follows immediately from Theorem 2.

We next prove a lemma that we shall require in the corollaries to follow.

Lemma 1. If the demand densities are given by $\varphi_1, \varphi_2, \ldots$,
and if

$$ \bar{x}(\varphi_1, \varphi_2, \ldots) \le \bar{x}(\varphi_2, \varphi_3, \ldots), $$

then $\bar{x}(\varphi_1, \varphi_2, \ldots) = \bar{x}(\varphi_1, \varphi_1, \varphi_1, \ldots)$,
where $\bar{x}(\varphi_1, \varphi_1, \varphi_1, \ldots)$ can be explicitly
computed as the unique positive root of the equation

$$ c + \int_0^y [h'(y - \xi) - \alpha c]\,\varphi_1(\xi)\,d\xi - \int_y^\infty [p'(\xi - y) + r]\,\varphi_1(\xi)\,d\xi = 0. \tag{31} $$

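Proof: We have seen in the proof of Theorem 1 that
$\bar{x}(\varphi_1, \varphi_2, \ldots)$ is the smallest root of the equation
$G'(y; \varphi_1, \varphi_2, \ldots) = 0$, where

$$ G'(y; \varphi_1, \varphi_2, \ldots) = c + L'(y; \varphi_1) + \alpha \int_0^y f'(y - \xi; \varphi_2, \varphi_3, \ldots)\,\varphi_1(\xi)\,d\xi. \tag{32} $$

Moreover, by Theorem 1,

$$ f'(x; \varphi_2, \varphi_3, \ldots) = -c \quad \text{for } x \le \bar{x}(\varphi_2, \varphi_3, \ldots). \tag{33} $$

It is clear from our hypothesis that

$$ f'(x; \varphi_2, \varphi_3, \ldots) = -c \quad \text{for } x \le \bar{x}(\varphi_1, \varphi_2, \ldots). \tag{34} $$

If we set $y = \bar{x}(\varphi_1, \varphi_2, \ldots)$ in (32), it follows that
in the integrand $f'(x; \varphi_2, \varphi_3, \ldots)$ is always evaluated at a
point $x \le \bar{x}(\varphi_1, \varphi_2, \ldots)$. Hence
$\bar{x}(\varphi_1, \varphi_2, \ldots)$ is the smallest root of the equation

$$ c + L'(y; \varphi_1) - \alpha c \int_0^y \varphi_1(\xi)\,d\xi = 0. \tag{35} $$

However, the smallest root of (35) is also
$\bar{x}(\varphi_1, \varphi_1, \ldots)$ by a similar argument involving the
range of values of $f'(x; \varphi_1, \varphi_1, \ldots)$ in the equation for

$$ G'(y; \varphi_1, \varphi_1, \ldots) = 0. $$

Thus $\bar{x}(\varphi_1, \varphi_2, \ldots) = \bar{x}(\varphi_1, \varphi_1, \ldots)$,
and the lemma is proved.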
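
To see what equation (31) gives in a simple case, assume linear costs $h(u) = h_0 u$ and $p(u) = p_0 u$ and exponential demand with rate `lam`. These choices and all numerical values are assumptions made for illustration. Under them (31) becomes $c + (h_0 - \alpha c)\,\Phi_1(y) - (p_0 + r)[1 - \Phi_1(y)] = 0$, which can be solved by bisection or in closed form; the sketch also computes the one-period critical number for comparison.

```python
import math

# Hypothetical cost, discount, and demand data.
c, h0, p0, r, alpha, lam = 1.0, 0.5, 3.0, 2.0, 0.9, 0.25

def Phi(y):
    """Distribution function of exponential demand with rate lam."""
    return 1.0 - math.exp(-lam * y)

def stationary_equation(y):
    """Left-hand side of equation (31) for linear h and p."""
    return c + (h0 - alpha * c) * Phi(y) - (p0 + r) * (1.0 - Phi(y))

def bisect(g, lo, hi, tol=1e-10):
    """Root of a non-decreasing function g with g(lo) < 0 <= g(hi)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

x_stationary = bisect(stationary_equation, 0.0, 200.0)

# Closed forms: the stationary root and the one-period root of (6).
q_stationary = (p0 + r - c) / (h0 + p0 + r - alpha * c)
q_one_period = (p0 + r - c) / (h0 + p0 + r)
x_stationary_closed = -math.log(1.0 - q_stationary) / lam
x_one_period = -math.log(1.0 - q_one_period) / lam

print(x_one_period, x_stationary, x_stationary_closed)
```

In this example the term $-\alpha c$ lowers the effective holding cost, because stock left over saves a purchase next period, so the stationary critical number exceeds the one-period one. Assumption II, $h_0 + p_0 + r - \alpha c > 0$, keeps the left-hand side of (31) increasing in $y$.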

According to the next corollary, if the demand density increases (stochasti¬ 
cally) in successive periods, the optimal critical number also increases, and the 
critical number can be calculated in each period as if the demand density in 
the future periods were stationary. 

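Corollary 2. If $\varphi_1 \subset \varphi_2 \subset \varphi_3 \subset \cdots$,
then (a) $\bar{x}(\varphi_1, \varphi_2, \ldots) \le \bar{x}(\varphi_2, \varphi_3, \ldots)$
and (b) $\bar{x}(\varphi_1, \varphi_2, \ldots) = \bar{x}(\varphi_1, \varphi_1, \ldots)$.

Proof: Conclusion (a) follows from Theorem 2 and conclusion (b) from Lemma 1.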

Lemma 2. If the demand densities for the future are given by
$\varphi_1, \varphi_2, \ldots$, then
$\bar{x}(\varphi_1, \varphi_2, \ldots) \le \bar{x}(\varphi_1, \varphi_1, \ldots)$
always.

Proof: We have proved before that $\bar{x}(\varphi_1, \varphi_2, \ldots)$ is
the smallest root of the equation $G'(y; \varphi_1, \varphi_2, \ldots) = 0$,
where $G'(y; \varphi_1, \varphi_2, \ldots)$ is defined by (32). From Theorem 1
we know that

$$ f'(x; \varphi_2, \varphi_3, \ldots) = \begin{cases} -c, & x < \bar{x}(\varphi_2, \varphi_3, \ldots), \\ -c + G'(x; \varphi_2, \varphi_3, \ldots), & x > \bar{x}(\varphi_2, \varphi_3, \ldots), \end{cases} \tag{36} $$

and also that $\bar{x}(\varphi_1, \varphi_1, \ldots)$ is the smallest root of
(31). Since

$$ f'(x; \varphi_2, \varphi_3, \ldots) \ge -c $$

always for all $x \ge 0$, it follows that

$$ G'(y; \varphi_1, \varphi_2, \ldots) \ge G'(y; \varphi_1, \varphi_1, \ldots) \quad \text{for all } y \ge 0. \tag{37} $$

Hence

$$ \bar{x}(\varphi_1, \varphi_2, \ldots) \le \bar{x}(\varphi_1, \varphi_1, \ldots). $$

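Corollary 3. If the demand densities for the future are given by
$\varphi_1, \varphi_2, \ldots$, and if $\varphi_2 \subset \varphi_1$, then

$$ \bar{x}(\varphi_1, \varphi_2, \ldots) \ge \bar{x}(\varphi_2, \varphi_3, \ldots). $$

Proof: Suppose the contrary is true, namely that

$$ \bar{x}(\varphi_1, \varphi_2, \ldots) < \bar{x}(\varphi_2, \varphi_3, \ldots). $$

Then by Lemma 1

$$ \bar{x}(\varphi_1, \varphi_2, \ldots) = \bar{x}(\varphi_1, \varphi_1, \ldots). $$

By Theorem 2 and the hypothesis,

$$ \bar{x}(\varphi_1, \varphi_1, \ldots) \ge \bar{x}(\varphi_2, \varphi_2, \ldots), $$

and by Lemma 2 we know that

$$ \bar{x}(\varphi_2, \varphi_2, \ldots) \ge \bar{x}(\varphi_2, \varphi_3, \ldots). $$

Hence

$$ \bar{x}(\varphi_1, \varphi_2, \ldots) \ge \bar{x}(\varphi_2, \varphi_3, \ldots), $$

contradicting our assumption.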

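Corollary 4. If $\bar{x}(\varphi_1, \varphi_2, \ldots) > \bar{x}(\varphi_2, \varphi_3, \ldots)$,
where $\varphi_1, \varphi_2, \ldots$ are the demand densities for the future and
$\varphi_1 \subset \varphi_2$, then

$$ \bar{x}(\varphi_2, \varphi_3, \ldots) > \bar{x}(\varphi_3, \varphi_4, \ldots). $$

Proof: Suppose the contrary is true, namely that

$$ \bar{x}(\varphi_2, \varphi_3, \ldots) \le \bar{x}(\varphi_3, \varphi_4, \ldots). $$

Then by Lemma 1

$$ \bar{x}(\varphi_2, \varphi_3, \ldots) = \bar{x}(\varphi_2, \varphi_2, \ldots), $$

and by Theorem 2

$$ \bar{x}(\varphi_1, \varphi_1, \ldots) \le \bar{x}(\varphi_2, \varphi_2, \ldots). $$

Applying Lemma 2, we obtain

$$ \bar{x}(\varphi_1, \varphi_2, \ldots) \le \bar{x}(\varphi_1, \varphi_1, \ldots). $$

Hence

$$ \bar{x}(\varphi_1, \varphi_2, \ldots) \le \bar{x}(\varphi_2, \varphi_3, \ldots), $$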


contradicting our hypothesis.

We can now describe the variation of the critical number in terms of the
variation of the demand density [...] but we shall represent it by a curve
(Fig. 1) [...] the form shown in Fig. 1 [...] essentially in the manner [...]
successive periods.

The following general conclusions are to be drawn from this [...] decreases,
the critical number initially decreases. On the [...] increases while the [...]
then the critical number in [...] following period is still smaller.

Corollaries 1-4 and Lemmas 1 and 2 remain valid when the backlogging of excess
demand is permitted. The proofs require only slight modifications.

3. Optimal Policy for Convex Purchase Cost

In this section we characterize the form of the optimal policy when we have a 
strictly convex, increasing purchase cost and a linear shortage cost. The other 
assumptions stipulated in conditions (i)-(vi), p. 9, are assumed to be satisfied. 
We assume in addition that p + r > c'(0), where c(z) is the cost of purchasing 
an amount z. This assumption is necessary to ensure positive ordering when the 
stock level is zero; if it is not satisfied, we simply do not order, or—what amounts 
to the same thing—no inventories are kept on hand. The theorems, corollaries, 
and lemmas in this section are numbered with primes to emphasize their rela¬ 
tion to those of Section 2. 

Theorem 1'. A. Infinite-Stage Model.

If the demand densities for the future are given by
$\varphi_1, \varphi_2, \ldots$, if the purchase cost $c(z)$ is strictly convex,
the shortage cost linear with marginal unit penalty cost $p$, and the holding
cost convex, and if $p + r > c'(0)$, then there exists a number
$\bar{x}(\varphi_1, \varphi_2, \ldots)$
[$0 < \bar{x}(\varphi_1, \varphi_2, \ldots) \le \infty$] and a function
$y(x; \varphi_1, \varphi_2, \ldots)$ such that

(a) $1 > dy/dx > 0$,

(b) $-f'(x; \varphi_1, \varphi_2, \ldots) < p + r$,

(c) $f''(x; \varphi_1, \varphi_2, \ldots) > 0$,

(d) $y(x; \varphi_1, \varphi_2, \ldots) > x$ for $x < \bar{x}(\varphi_1, \varphi_2, \ldots)$.

The number $\bar{x}(\varphi_1, \varphi_2, \ldots)$ and the function
$y(x; \varphi_1, \varphi_2, \ldots)$ determine the optimal policy in the
following manner: if $x < \bar{x}(\varphi_1, \varphi_2, \ldots)$, the optimal
policy is to order up to $y(x; \varphi_1, \varphi_2, \ldots)$; if
$x > \bar{x}(\varphi_1, \varphi_2, \ldots)$, the optimal policy is not to order.

B. n-Stage Model.

For the truncated dynamic model of $n$ periods ($n \ge 1$),
$\bar{x}(\varphi_1, \varphi_2, \ldots, \varphi_n)$ and
$y(x; \varphi_1, \varphi_2, \ldots, \varphi_n)$ express the optimal policy for
the first ordering opportunity in the same sense as above, and these quantities
satisfy the properties

(a) $0 < dy(x; \varphi_1, \varphi_2, \ldots, \varphi_n)/dx < 1$,

(b) $-f'(x; \varphi_1, \varphi_2, \ldots, \varphi_{n-1}) < -f'(x; \varphi_1, \varphi_2, \ldots, \varphi_n)$
for $x < \bar{x}(\varphi_1, \varphi_2, \ldots, \varphi_n)$,

(c) $-f'(x; \varphi_1, \varphi_2, \ldots, \varphi_n) < p + r$,

(d) $f''(x; \varphi_1, \varphi_2, \ldots, \varphi_n) > 0$,

(e) $y(x; \varphi_1, \varphi_2, \ldots, \varphi_n) > y(x; \varphi_1, \varphi_2, \ldots, \varphi_{n-1})$
for all $x < \bar{x}(\varphi_1, \varphi_2, \ldots, \varphi_n)$,

(f) $\bar{x}(\varphi_1, \varphi_2, \ldots) > \bar{x}(\varphi_1, \varphi_2, \ldots, \varphi_n) > \bar{x}(\varphi_1, \varphi_2, \ldots, \varphi_{n-1})$,

provided $\bar{x}(\varphi_1, \varphi_2, \ldots, \varphi_n)$ is finite; if
$\bar{x}(\varphi_1, \varphi_2, \ldots, \varphi_n)$ is infinite for any $n$, all
$\bar{x}(\varphi_1, \varphi_2, \ldots, \varphi_m)$ with $m \ge n$ are infinite.

Proof: For the case of a stationary demand distribution, this theorem is proved
in [2], p. 150. If the demand density varies from period to period, the proof is
the same and will not be repeated here. We point out a slight error in the proof
given in [2]: property (ii) of part B there, which is property (b) of part B
above, is valid only for $x$ in the range
$x < \bar{x}(\varphi_1, \varphi_2, \ldots, \varphi_n)$, and not for all $x > 0$.
This can be demonstrated by an induction argument. The rest of the proof in [2]
[...].

Theorem 2' is now stated and proved for the case of a strictly convex,
increasing purchase cost, a convex, increasing holding cost, a linear shortage
cost, and the [...].

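Theorem 2'. If we are given two sequences of demand densities
$\varphi_1, \varphi_2, \ldots$ and $\psi_1, \psi_2, \ldots$, and if
$\varphi_i \subset \psi_i$ for $i = 1, 2, \ldots$, then

$$ \begin{aligned} &\text{(a)} \quad y(x; \varphi_1, \varphi_2, \ldots) \le y(x; \psi_1, \psi_2, \ldots) && \text{for all } x \ge 0, \\ &\text{(b)} \quad f'(x; \varphi_1, \varphi_2, \ldots) \ge f'(x; \psi_1, \psi_2, \ldots) && \text{for all } x \ge 0. \end{aligned} \tag{38} $$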

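Proof: The proof is by induction on the number of periods. For the one-period
model consider demands $\varphi_1$ and $\psi_1$, where
$\varphi_1 \subset \psi_1$. The expected cost if we follow an optimal policy is

$$ f(x; \varphi_1) = \min_{y \ge x} \Big\{ c(y - x) + \int_0^y [h(y - \xi) - r\xi]\,\varphi_1(\xi)\,d\xi + \int_y^\infty [p(\xi - y) - ry]\,\varphi_1(\xi)\,d\xi \Big\}. \tag{39} $$

Set

$$ G(y, x; \varphi_1) = c(y - x) + \int_0^y [h(y - \xi) - r\xi]\,\varphi_1(\xi)\,d\xi + \int_y^\infty [p(\xi - y) - ry]\,\varphi_1(\xi)\,d\xi, \tag{40} $$

and define

$$ H(y; \varphi_1) = \int_0^y [h(y - \xi) - r\xi]\,\varphi_1(\xi)\,d\xi + \int_y^\infty [p(\xi - y) - ry]\,\varphi_1(\xi)\,d\xi. \tag{41} $$

A similar expression is obtained for the demand $\psi_1$. $H(y; \varphi_1)$ is
precisely $L(y; \varphi_1)$, which is a convex function of $y$; hence
$-H'(y; \varphi_1)$ is a decreasing function of $y$. By assumption,
$c'(y - x)$ is a strictly increasing function of $y$. Thus

$$ \frac{\partial G(y, x; \varphi_1)}{\partial y} = c'(y - x) + H'(y; \varphi_1) = 0 \tag{42} $$

has at most one root, call it $y(x; \varphi_1)$, for $x < \bar{x}(\varphi_1)$,
where $\bar{x}(\varphi_1)$ is the root of

$$ c'(0) + H'(y; \varphi_1) = 0. \tag{43} $$

Since $p + r > c'(0)$, equation (43) has one root. As in the derivation of (20),
$\varphi_1 \subset \psi_1$ implies

$$ H'(y; \varphi_1) \ge H'(y; \psi_1) \quad \text{for all } y \ge 0. \tag{44} $$

Since both $H'(y; \varphi_1)$ and $H'(y; \psi_1)$ are increasing functions of
$y$, it follows, in view of (43), that

$$ \bar{x}(\varphi_1) \le \bar{x}(\psi_1). \tag{45} $$

Similarly, we see from (42) that

$$ y(x; \varphi_1) \le y(x; \psi_1) \quad \text{for } x < \bar{x}(\varphi_1). \tag{46} $$

Recall from Theorem 1' that

$$ \begin{aligned} y(x; \varphi_1) &> x && \text{for } x < \bar{x}(\varphi_1), \\ y(x; \varphi_1) &= x && \text{for } \bar{x}(\varphi_1) \le x. \end{aligned} \tag{47} $$

Combining (45), (46) and (47), we obtain assertion (a).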

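To obtain (b) we observe that

$$ f'(x; \varphi_1) = \begin{cases} -c'[y(x; \varphi_1) - x], & x < \bar{x}(\varphi_1), \\ H'(x; \varphi_1), & \bar{x}(\varphi_1) < x. \end{cases} \tag{48} $$

We consider three cases. For $x < \bar{x}(\varphi_1) < \bar{x}(\psi_1)$,

$$ f'(x; \varphi_1) = -c'[y(x; \varphi_1) - x] \ge -c'[y(x; \psi_1) - x] = f'(x; \psi_1). \tag{49} $$

For $\bar{x}(\varphi_1) < x < \bar{x}(\psi_1)$,

$$ f'(x; \varphi_1) = H'(x; \varphi_1) > -c'(0) > -c'[y(x; \psi_1) - x] = f'(x; \psi_1). \tag{50} $$

For $\bar{x}(\varphi_1) < \bar{x}(\psi_1) < x$,

$$ f'(x; \varphi_1) = H'(x; \varphi_1) \ge H'(x; \psi_1) = f'(x; \psi_1), $$

and (b) is verified.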
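
The one-period policy for a convex purchase cost can be illustrated numerically. The sketch assumes a quadratic purchase cost $c(z) = c_1 z + \tfrac{1}{2} c_2 z^2$, a linear holding cost $h(u) = h_0 u$, and exponential demand with rate `lam`; these forms, the numerical values, and the names `order_up_to` and `bisect` are assumptions made for illustration only. With them $H'(y) = h_0 \Phi_1(y) - (p + r)[1 - \Phi_1(y)]$, the critical number solves (43), and $y(x; \varphi_1)$ solves (42).

```python
import math

# Hypothetical data: c(z) = c1*z + 0.5*c2*z**2, so c'(z) = c1 + c2*z.
c1, c2 = 1.0, 0.5
h0, p, r, lam = 0.5, 3.0, 2.0, 0.25

def Phi(y):
    """Distribution function of exponential demand with rate lam."""
    return 1.0 - math.exp(-lam * y)

def dH(y):
    """H'(y) for linear holding cost h0 and unit shortage penalty p."""
    return h0 * Phi(y) - (p + r) * (1.0 - Phi(y))

def dc(z):
    """Marginal purchase cost c'(z)."""
    return c1 + c2 * z

def bisect(g, lo, hi, tol=1e-10):
    """Root of an increasing function g with g(lo) < 0 < g(hi)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Critical number: root of c'(0) + H'(x) = 0, equation (43).
x_bar = bisect(lambda y: dc(0.0) + dH(y), 0.0, 100.0)

def order_up_to(x):
    """y(x): root in y of c'(y - x) + H'(y) = 0 for x < x_bar, else x."""
    if x >= x_bar:
        return x
    # Since H'(y) >= -(p + r), the marginal cost is positive at this upper end.
    hi = x + (p + r) / c2 + 1.0
    return bisect(lambda y: dc(y - x) + dH(y), x, hi)

y0 = order_up_to(0.0)
y2 = order_up_to(2.0)
print(x_bar, y0, y2)
```

Unlike the linear case, the target level $y(x)$ now depends on the starting stock $x$: it rises with $x$, but by less than one unit per unit of stock, in line with property (a) of Theorem 1'.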

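Assume now that we have proved the following: For any $n - 1$ pairs of demand
densities $\varphi_i$ and $\psi_i$ satisfying $\varphi_i \subset \psi_i$,

$$ y(x; \varphi_1, \varphi_2, \ldots, \varphi_{n-1}) \le y(x; \psi_1, \psi_2, \ldots, \psi_{n-1}) \quad \text{for all } x \ge 0, $$

and

$$ f'(x; \varphi_1, \varphi_2, \ldots, \varphi_{n-1}) \ge f'(x; \psi_1, \psi_2, \ldots, \psi_{n-1}) \quad \text{for all } x \ge 0. $$

We now prove the theorem for the $n$-period model. The functional equation
for the $n$-period model becomes

$$ f(x; \varphi_1, \varphi_2, \ldots, \varphi_n) = \min_{y \ge x} \Big\{ c(y - x) + L(y; \varphi_1) + \alpha \Big[ f(0; \varphi_2, \ldots, \varphi_n) \int_y^\infty \varphi_1(\xi)\,d\xi + \int_0^y f(y - \xi; \varphi_2, \ldots, \varphi_n)\,\varphi_1(\xi)\,d\xi \Big] \Big\}. \tag{51} $$

As in (40) and (41), we define

$$ G(y, x; \varphi_1, \varphi_2, \ldots, \varphi_n) = c(y - x) + L(y; \varphi_1) + \alpha \Big[ f(0; \varphi_2, \varphi_3, \ldots, \varphi_n) \int_y^\infty \varphi_1(\xi)\,d\xi + \int_0^y f(y - \xi; \varphi_2, \varphi_3, \ldots, \varphi_n)\,\varphi_1(\xi)\,d\xi \Big], $$

$$ H(y; \varphi_1, \varphi_2, \ldots, \varphi_n) = L(y; \varphi_1) + \alpha \Big[ f(0; \varphi_2, \varphi_3, \ldots, \varphi_n) \int_y^\infty \varphi_1(\xi)\,d\xi + \int_0^y f(y - \xi; \varphi_2, \varphi_3, \ldots, \varphi_n)\,\varphi_1(\xi)\,d\xi \Big]. \tag{52} $$

We also define $y(x; \varphi_1, \varphi_2, \ldots, \varphi_n)$ to be the unique
root of

$$ \frac{\partial G(y, x; \varphi_1, \varphi_2, \ldots, \varphi_n)}{\partial y} = 0 \quad \text{for } x < \bar{x}(\varphi_1, \varphi_2, \ldots, \varphi_n), \tag{53} $$

where $\bar{x}(\varphi_1, \varphi_2, \ldots, \varphi_n)$ is the unique root of

$$ c'(0) + H'(x; \varphi_1, \varphi_2, \ldots, \varphi_n) = 0. \tag{54} $$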

The existence of these roots is guaranteed by the convexity of $c(\cdot)$ and
$H(x; \varphi_1, \varphi_2, \ldots, \varphi_n)$ and by the assumption that
$p + r > c'(0)$; the convexity of
$H(x; \varphi_1, \varphi_2, \ldots, \varphi_n)$ follows from Theorem 1', part B,
property (d). Using our induction assumption, we obtain

$$ H'(y; \varphi_1, \varphi_2, \ldots, \varphi_n) \ge L'(y; \varphi_1) + \alpha \int_0^y f'(y - \xi; \psi_2, \ldots, \psi_n)\,\varphi_1(\xi)\,d\xi. \tag{55} $$

Integrating the integral on the right-hand side of (55) by parts gives

$$ H'(y; \varphi_1, \varphi_2, \ldots, \varphi_n) \ge [h'(0) + p + r + \alpha f'(0; \psi_2, \psi_3, \ldots, \psi_n)]\,\Phi_1(y) - (p + r) + \int_0^y h''(y - \xi)\,\Phi_1(\xi)\,d\xi + \alpha \int_0^y f''(y - \xi; \psi_2, \psi_3, \ldots, \psi_n)\,\Phi_1(\xi)\,d\xi, \tag{56} $$

where $\Phi_1(\xi)$ is defined in (17). Recalling Theorem 1', part B, property
(c), we have

$$ h'(0) + p + r + \alpha f'(0; \psi_2, \psi_3, \ldots, \psi_n) > 0. \tag{57} $$

Since $\varphi_1 \subset \psi_1$, we can replace $\Phi_1(\xi)$ by
$\Psi_1(\xi)$ in (56) and retain the inequality;

$$ H'(y; \varphi_1, \varphi_2, \ldots, \varphi_n) \ge H'(y; \psi_1, \psi_2, \ldots, \psi_n) \quad \text{for all } y \ge 0. \tag{58} $$

Arguing from (53), (54), and (58) as we did from the analogous relations for
the one-period model, we obtain

$$ y(x; \varphi_1, \varphi_2, \ldots, \varphi_n) \le y(x; \psi_1, \psi_2, \ldots, \psi_n) $$

and

$$ f'(x; \varphi_1, \varphi_2, \ldots, \varphi_n) \ge f'(x; \psi_1, \psi_2, \ldots, \psi_n). $$

The result for the infinite-stage model is obtained by passing to the limit.

From Theorem 2' we proceed to deduce a series of corollaries comparable to
those for linear ordering costs. In the corollaries and lemmas to follow, our
assumptions are the same as in Theorem 2'. The proofs entail adaptations of the
reasoning employed in establishing Corollaries 1-4 of Section 2. We omit the
formal details.

Corollary 1': If the demand densities for the first $n + 1$ periods are given
by $\varphi_1, \varphi_2, \ldots, \varphi_{n+1}$, then

$$ y(x; \varphi_1, \varphi_2, \ldots, \varphi_n, 0) \le y(x; \varphi_1, \varphi_2, \ldots, \varphi_n, \varphi_{n+1}) \quad \text{for all } x \ge 0. $$

Corollary 2': If $\varphi_1 \subset \varphi_2 \subset \varphi_3 \subset \cdots$, then

$$ y(x; \varphi_1, \varphi_2, \ldots) \le y(x; \varphi_2, \varphi_3, \ldots). $$

Corollary 3': If the demand densities are $\varphi_1, \varphi_2, \ldots$, and
if $\varphi_2 \subset \varphi_1$, then

$$ y(x; \varphi_1, \varphi_2, \ldots) \ge y(x; \varphi_2, \varphi_3, \ldots) \quad \text{for all } x. $$

Corollary 4': If $y(x; \varphi_1, \varphi_2, \ldots) \ge y(x; \varphi_2, \varphi_3, \ldots)$
for all $x \ge 0$, where $\varphi_1, \varphi_2, \ldots$ are the demand densities
and $\varphi_1 \subset \varphi_2$, then

$$ y(x; \varphi_2, \varphi_3, \ldots) \ge y(x; \varphi_3, \varphi_4, \ldots) \quad \text{for all } x \ge 0. $$

The remarks made in Section 2 concerning patterns of demand and corre¬ 
sponding patterns of variation in the form of the optimal policy apply also in 
this case. 

4. Optimal Policy for Linear Purchase Cost, Where There Are Lags in Delivery

In this section we characterize the form of the optimal policy when
$h(\cdot)$ and $p(\cdot)$ are convex increasing, $c(z) = c \cdot z$, excess
demand is backlogged, and there are time lags in delivery. We consider a lag of
$\lambda$ periods between ordering and delivery. Delivery occurs only at the
beginning of a period. Let $x$ represent the current stock size. Let
$y_1, y_2, \ldots, y_{\lambda-1}$ represent the outstanding orders, where $y_1$
is due in at the start of the next period, $y_2$ is to be delivered two periods
hence, etc. We define $z$ as the amount of stock to be ordered at the start of
the present period. Finally, let
$f(x, y_1, y_2, \ldots, y_{\lambda-1}; \varphi_1, \varphi_2, \ldots)$ denote the
minimum expected loss following an optimal policy, where
$(x, y_1, y_2, \ldots, y_{\lambda-1})$ takes into account the current stock
level and quantities of goods to be delivered during the following
$\lambda - 1$ periods, and $\varphi_1, \varphi_2, \ldots$ are the demand
densities starting from the present period. The functional equation in this
case becomes

$$ f(x, y_1, y_2, \ldots, y_{\lambda-1}; \varphi_1, \varphi_2, \ldots) = \min_{z \ge 0} \Big\{ c \cdot z + L(x; \varphi_1) + \alpha \int_0^\infty f(x + y_1 - \xi, y_2, \ldots, y_{\lambda-1}, z; \varphi_2, \varphi_3, \ldots)\,\varphi_1(\xi)\,d\xi \Big\}, \tag{59} $$

where

$$ L(x; \varphi_1) = \begin{cases} \displaystyle \int_0^x h(x - \xi)\,\varphi_1(\xi)\,d\xi + \int_x^\infty p(\xi - x)\,\varphi_1(\xi)\,d\xi, & x > 0, \\ \displaystyle \int_0^\infty p(\xi - x)\,\varphi_1(\xi)\,d\xi, & x \le 0. \end{cases} \tag{60} $$

In the remainder of this section we consider only the case of a one-period lag,
since in the general case the notation becomes quite complex, and the
arguments more tedious than the matter warrants. In what follows we assume

$$ \lim_{x \to \infty} p'(x) > \frac{c}{\alpha} - c. $$

Double-prime numbering is used to relate the results of this section to those of
the previous sections.

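Theorem 1'': If $h(\cdot)$ and $p(\cdot)$ are convex increasing, if
$c(z) = c \cdot z$, and if there is a one-period lag, then the optimal policy is
of the form

$$ z^*(x) = \begin{cases} \bar{x}(\varphi_1, \varphi_2, \ldots) - x, & x < \bar{x}(\varphi_1, \varphi_2, \ldots), \\ 0, & \text{otherwise}, \end{cases} $$

where $\bar{x}(\varphi_1, \varphi_2, \ldots)$ is the unique root of

$$ c + \alpha \int_0^\infty f'(y - \xi; \varphi_2, \varphi_3, \ldots)\,\varphi_1(\xi)\,d\xi = 0. \tag{61} $$

Moreover, $f(x; \varphi_1, \varphi_2, \ldots)$ is convex and twice continuously
differentiable except possibly at $\bar{x}(\varphi_1, \varphi_2, \ldots)$.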

Proof: This proof for stationary demands is given in [2], p. 162. The proof 
for varying demands is the same and will not be repeated here. Equation (iv) 
on p. 163 of [2] should be corrected to read 


—fn(x) — fn-l(x) for X < Xn . 

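Theorem 2'': If we are given two sequences of demand densities
$\varphi_1, \varphi_2, \ldots$ and $\psi_1, \psi_2, \ldots$, if
$\varphi_i \subset \psi_i$ for $i = 1, 2, \ldots$, and if the hypotheses of
Theorem 1'' are satisfied, then

(a) $z^*(x; \varphi_1, \varphi_2, \ldots) \le z^*(x; \psi_1, \psi_2, \ldots)$ for all $x$,

and

(b) $f'(x; \varphi_1, \varphi_2, \ldots) \ge f'(x; \psi_1, \psi_2, \ldots)$ for all $x$.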


The proof proceeds along the same lines as the proofs of Theorems 2 and 2', 
and therefore will not be repeated. 

We turn to the corollaries for the case of time lags. 

Corollary 1": If the demand densities for the first $n + 1$ periods are given by
$\varphi_1, \dots, \varphi_n, \varphi_{n+1}$, then

$z^*(x; \varphi_1, \varphi_2, \dots, \varphi_n, 0) \le z^*(x; \varphi_1, \varphi_2, \dots, \varphi_n, \varphi_{n+1}).$

Proof: The proof is analogous to that of Corollary 1.

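Lemma 1": If the demand densities are $\varphi_1, \varphi_2, \dots$, and if
$\bar{x}(\varphi_1, \varphi_2, \dots) \le \bar{x}(\varphi_2, \varphi_3, \dots)$, then $\bar{x}(\varphi_1, \varphi_2, \dots)$ may be computed as the root of

(62)  $c[1 - \alpha] + \alpha \int_0^\infty L'(w - \xi; \varphi_2)\, \varphi_1(\xi)\, d\xi = 0.$

Proof: By Theorem 1", $\bar{x}(\varphi_1, \varphi_2, \dots)$ is the smallest root of

(63)  $c + \alpha \int_0^\infty f'(w - \xi; \varphi_2, \varphi_3, \dots)\, \varphi_1(\xi)\, d\xi = 0.$

In view of the form of the optimal policy, we obtain the explicit formula

(64)  $f'(x; \varphi_2, \varphi_3, \dots) = -c + L'(x; \varphi_2)$  for $x < \bar{x}(\varphi_2, \varphi_3, \dots)$,
      $f'(x; \varphi_2, \varphi_3, \dots) = L'(x; \varphi_2) + \alpha \int_0^\infty f'(x - \xi; \varphi_3, \varphi_4, \dots)\, \varphi_2(\xi)\, d\xi$  for $x > \bar{x}(\varphi_2, \varphi_3, \dots)$.

In solving (63) for its first root, we note that $f'(x; \varphi_2, \varphi_3, \dots)$ is evaluated
only for $x < \bar{x}(\varphi_1, \varphi_2, \dots) \le \bar{x}(\varphi_2, \varphi_3, \dots)$. But on this range
$f'(x; \varphi_2, \varphi_3, \dots) = -c + L'(x; \varphi_2)$, which when substituted in (63) gives the desired
result.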
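To illustrate how (62) may be used in computation, the sketch below finds its root by bisection. It rests on assumptions not made in the text: linear costs $h(u) = hu$ and $p(u) = pu$, so that $L'(y; \varphi_2) = h\,\Phi_2(y) - p\,[1 - \Phi_2(y)]$ for $y > 0$ and $L'(y; \varphi_2) = -p$ for $y \le 0$, and stationary exponential demand, so that the hypothesis of Lemma 1" holds with equality. The function and parameter names are hypothetical.

```python
import math


def lag_critical_number(c, alpha, h, p, cdf2, pdf1, upper=80.0, steps=4000):
    """Root of (62) by bisection, assuming linear holding and shortage costs.

    cdf2 is the distribution function of the second-period demand and pdf1
    the density of the first-period demand. The integral over [0, infinity)
    is truncated at `upper`.
    """
    width = upper / steps

    def left_side(w):
        acc = 0.0
        for k in range(steps):
            xi = (k + 0.5) * width
            y = w - xi
            # L'(y; phi_2) under the linear-cost assumption.
            d_loss = h * cdf2(y) - p * (1.0 - cdf2(y)) if y > 0 else -p
            acc += d_loss * pdf1(xi) * width
        return c * (1.0 - alpha) + alpha * acc

    lo, hi = 0.0, upper
    if left_side(lo) >= 0.0:
        return 0.0  # ordering is never worthwhile under these parameters
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if left_side(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def exp_cdf(y, rate=1.0):
    return 1.0 - math.exp(-rate * y) if y > 0 else 0.0


def exp_pdf(xi, rate=1.0):
    return rate * math.exp(-rate * xi)


critical = lag_critical_number(c=1.0, alpha=0.9, h=1.0, p=10.0,
                               cdf2=exp_cdf, pdf1=exp_pdf)
```

Under these assumptions (62) reduces to $P(\xi_1 + \xi_2 \le w) = [\alpha p - c(1 - \alpha)] / [\alpha (h + p)]$, so the critical number is a fractile of two-period demand, which is what one would expect when an order placed now arrives one period later.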

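Corollary 2": If $\varphi_1 \subset \varphi_2 \subset \varphi_3 \subset \cdots$, then (a) $z^*(x; \varphi_1, \varphi_2, \dots) \le
z^*(x; \varphi_2, \varphi_3, \dots)$ for all $x$, and (b) $\bar{x}(\varphi_1, \varphi_2, \dots)$ is computed by (62).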

Proof: Conclusion (a) follows from Theorem 2" and conclusion (b) from 
Lemma 1". 

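Lemma 2": If the demand densities are $\varphi_1, \varphi_2, \dots$, then

(65)  $f'(x; \varphi_1, \varphi_2, \dots) \ge -c + L'(x; \varphi_1)$  for all $x$.

Proof: In view of the form of the optimal policy, we have

(66)  $f'(x; \varphi_1, \varphi_2, \dots) = -c + L'(x; \varphi_1)$  for $x < \bar{x}(\varphi_1, \varphi_2, \dots)$,
      $f'(x; \varphi_1, \varphi_2, \dots) = L'(x; \varphi_1) + \alpha \int_0^\infty f'(x - \xi; \varphi_2, \varphi_3, \dots)\, \varphi_1(\xi)\, d\xi$  for $x > \bar{x}(\varphi_1, \varphi_2, \dots)$.

But for $x > \bar{x}(\varphi_1, \varphi_2, \dots)$

(67)  $c + \alpha \int_0^\infty f'(x - \xi; \varphi_2, \varphi_3, \dots)\, \varphi_1(\xi)\, d\xi > 0.$

It is clear from a comparison of (66) and (67) that (65) obtains.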

Corollary 3": If the demand densities are $\varphi_1, \varphi_2, \dots$, and if $\varphi_3 \subset \varphi_2 \subset \varphi_1$,
then

$z^*(x; \varphi_1, \varphi_2, \dots) \ge z^*(x; \varphi_2, \varphi_3, \dots)$  for all $x$.

(In the case of a $\lambda$-period lag in delivery, the hypothesis should read
$\varphi_{\lambda+2} \subset \varphi_{\lambda+1} \subset \cdots \subset \varphi_1$.)

Proof: Assume the contrary, i.e.,

$z^*(x; \varphi_1, \varphi_2, \dots) < z^*(x; \varphi_2, \varphi_3, \dots)$  for all $x$.

Then we have

(68)  $\bar{x}(\varphi_1, \varphi_2, \dots) < \bar{x}(\varphi_2, \varphi_3, \dots),$

and by Lemma 1", $\bar{x}(\varphi_1, \varphi_2, \dots)$ is the smallest root of

(69)  $c[1 - \alpha] + \alpha \int_0^\infty L'(w - \xi; \varphi_2)\, \varphi_1(\xi)\, d\xi = 0.$

Since $\varphi_3 \subset \varphi_2$, we deduce that

(70)  $L'(x; \varphi_2) \le L'(x; \varphi_3)$  for all $x$.

Also, by hypothesis,

(71)  $\Phi_1(x) \le \Phi_2(x)$  for all $x \ge 0$.

From (69), (70), (71), and the fact that $L(x; \varphi)$ is convex, we obtain

(72)  $c[1 - \alpha] + \alpha \int_0^\infty L'(w - \xi; \varphi_3)\, \varphi_2(\xi)\, d\xi \ge 0.$

Now Lemma 2" states that

(73)  $f'(x; \varphi_3, \varphi_4, \dots) \ge -c + L'(x; \varphi_3)$  for all $x$.

Hence

(74)  $c + \alpha \int_0^\infty f'(w - \xi; \varphi_3, \varphi_4, \dots)\, \varphi_2(\xi)\, d\xi \ge 0.$

But (74) implies $\bar{x}(\varphi_2, \varphi_3, \dots) \le \bar{x}(\varphi_1, \varphi_2, \dots)$, contradicting (68).

Corollary 4": If the demand densities are $\varphi_1, \varphi_2, \varphi_3, \dots$, and if $\varphi_1 \subset \varphi_2 \subset \varphi_3$
and $\bar{x}(\varphi_1, \varphi_2, \dots) > \bar{x}(\varphi_2, \varphi_3, \dots)$, then $\bar{x}(\varphi_2, \varphi_3, \dots) \ge \bar{x}(\varphi_3, \varphi_4, \dots)$.

(In the case of a $\lambda$-period lag, $\varphi_1 \subset \varphi_2 \subset \varphi_3$ is replaced by
$\varphi_1 \subset \varphi_2 \subset \cdots \subset \varphi_{\lambda+2}$.)

Proof: Assume the contrary, i.e., $\bar{x}(\varphi_2, \varphi_3, \dots) < \bar{x}(\varphi_3, \varphi_4, \dots)$. Then, by
Lemma 1", $\bar{x}(\varphi_2, \varphi_3, \dots)$ is the smallest root of

(75)  $c[1 - \alpha] + \alpha \int_0^\infty L'(w - \xi; \varphi_3)\, \varphi_2(\xi)\, d\xi = 0.$

Since we have

(76)  $c[1 - \alpha] + \alpha \int_0^\infty L'(w - \xi; \varphi_2)\, \varphi_1(\xi)\, d\xi \ge 0$

as in the proof of Corollary 3", and by Lemma 2"

(77)  $c + \alpha \int_0^\infty f'(w - \xi; \varphi_2, \varphi_3, \dots)\, \varphi_1(\xi)\, d\xi \ge 0.$

But (77) implies $\bar{x}(\varphi_1, \varphi_2, \dots) \le \bar{x}(\varphi_2, \varphi_3, \dots)$, which contradicts the
hypothesis.

This completes our discussion of optimal policies when there is a time lag in
delivery. The remarks at the close of Section 2 also apply in the case of time
lags.

5. Some Statistical Examples 

In this section we study the behavior of the optimal critical number for the
dynamic inventory model where the demand distribution is stationary over
time but is unknown. More exactly, we shall assume that the demand distribution
has a known functional form but involves an unknown parameter w,
which is assumed initially to be estimable in terms of an a priori distribution.
The demands that occur in each period provide additional observations on the
demand density. These samples of the demand density are cumulatively used
to improve our estimate of the unknown parameter, until ultimately our modifications
of the a priori distribution of w yield an a posteriori distribution of w.

The expected costs for any policy are computed by averaging out with respect
to the distribution function of the parameter w.

We confine our attention to distribution functions that admit a single sufficient
statistic under repeated independent observations. There are two classes
of distributions in this category: the exponential family and the so-called range
family [5].

We shall develop several results on stochastic ordering for the a posteriori
demand densities that are induced by successive realizations of demand. Scarf
[13] obtained some of these results for the exponential family; it was his analysis
that suggested our more unified approach. At the close of the section we shall
apply these theorems on stochastic orderings to describing the variation of the
critical number as a function of the observations.

Consider densities of the exponential family

(78)  $\varphi(\xi \mid w)\, d\xi = \beta(w)\, e^{\xi w}\, r(\xi)\, d\xi, \qquad \xi \ge 0,$

where $w$ is an unknown parameter that is assumed to be estimable in terms of
an a priori density function $f(w)$. To simplify the notation, we have written
these densities in continuous form only. There is a discrete version that occurs
where $r(\xi)\, d\xi$ is replaced by a general regular $\sigma$-finite measure. The exponential
family includes the Gamma distributions in the continuous case and the Poisson
and negative binomial distributions in the discrete case.

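If we take $n$ independent observations, the joint density is

(79)  $\varphi(\xi_1, \xi_2, \dots, \xi_n \mid w)\, d\xi_1\, d\xi_2 \cdots d\xi_n = \prod_{i=1}^{n} \varphi(\xi_i \mid w)\, d\xi_1 \cdots d\xi_n.$

Let

$\sum_{i=1}^{n} \xi_i = S_n;$

then $S_n$ is clearly a sufficient statistic for the parameter $w$. By Bayes' rule the
a posteriori density of $w$ given $S_n$ is

(80)  $g(w \mid S_n)\, dw = \frac{\beta^n(w)\, e^{w S_n} f(w)\, dw}{\int_{-\infty}^{\infty} \beta^n(\theta)\, e^{\theta S_n} f(\theta)\, d\theta}.$

Hence the density of $\xi$ given $S_n$ is

(81)  $r(\xi) \int_{-\infty}^{\infty} \beta(\theta)\, e^{\theta \xi}\, g(\theta \mid S_n)\, d\theta = r(\xi)\, \frac{\int_{-\infty}^{\infty} \beta^{n+1}(\theta)\, e^{\theta(\xi + S_n)} f(\theta)\, d\theta}{\int_{-\infty}^{\infty} \beta^n(\theta)\, e^{\theta S_n} f(\theta)\, d\theta}.$

We introduce the notation $\psi_n(\xi \mid S_n)$ for this density. We may then state the
following theorem on stochastic ordering.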

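Theorem 3: If $S' > S$, then $\psi_n(\cdot \mid S) \subset \psi_n(\cdot \mid S')$.

Proof: We are required to prove that

$\int_0^y [\psi_n(\xi \mid S) - \psi_n(\xi \mid S')]\, d\xi \ge 0$  for all $y$.

Since $\psi_n(\xi \mid S)$ and $\psi_n(\xi \mid S')$ are densities, we have

(82)  $\int_0^\infty [\psi_n(\xi \mid S) - \psi_n(\xi \mid S')]\, d\xi = 0.$

To complete the proof, it is sufficient to show that $\psi_n(\xi \mid S) - \psi_n(\xi \mid S')$ changes
sign once from + to − values as $\xi$ traverses the positive axis from 0 to $\infty$. Now

(83)  $\psi_n(\xi \mid S) - \psi_n(\xi \mid S') = r(\xi) \left[ \frac{\int_{-\infty}^{\infty} \beta^{n+1}(\theta)\, e^{\theta \xi} e^{\theta S} f(\theta)\, d\theta}{\int_{-\infty}^{\infty} \beta^{n}(\theta)\, e^{\theta S} f(\theta)\, d\theta} - \frac{\int_{-\infty}^{\infty} \beta^{n+1}(\theta)\, e^{\theta \xi} e^{\theta S'} f(\theta)\, d\theta}{\int_{-\infty}^{\infty} \beta^{n}(\theta)\, e^{\theta S'} f(\theta)\, d\theta} \right]$
      $= r(\xi) \int_{-\infty}^{\infty} e^{\theta \xi} \left[ \frac{\beta^{n+1}(\theta)\, e^{\theta S} f(\theta)}{A} - \frac{\beta^{n+1}(\theta)\, e^{\theta S'} f(\theta)}{B} \right] d\theta,$

where

$A = \int_{-\infty}^{\infty} \beta^{n}(\theta)\, e^{\theta S} f(\theta)\, d\theta$  and  $B = \int_{-\infty}^{\infty} \beta^{n}(\theta)\, e^{\theta S'} f(\theta)\, d\theta.$

We digress for a moment to discuss the concept of variation-diminishing
transformations, which plays a part in the analysis of (83). The kernel $e^{\theta \xi}$
belongs to a Pólya-type distribution and is therefore variation-diminishing.
This means that the transformed function

(84)  $g(\xi) = \int e^{\theta \xi} h(\theta)\, d\mu(\theta)$   $[\mu(\theta) > 0]$

cannot change signs more frequently than the function $h(\theta)$. Furthermore, if
$g(\cdot)$ and $h(\cdot)$ have the same number of sign changes, then they change sign in the same
order or equivalently have the same sign for large values of the arguments.
(For a detailed discussion of Pólya-type distributions and their properties, we
refer the reader to [9] and [12].)

We see immediately that

$\frac{\beta^{n+1}(\theta)\, e^{\theta S} f(\theta)}{A} - \frac{\beta^{n+1}(\theta)\, e^{\theta S'} f(\theta)}{B} = \beta^{n+1}(\theta) f(\theta) \left[ \frac{e^{\theta S}}{A} - \frac{e^{\theta S'}}{B} \right]$

has a single sign change from + values to − values as $\theta$ traverses its natural
range. Comparing (83) and (84), we deduce that $\psi_n(\xi \mid S) - \psi_n(\xi \mid S')$ has
at most one sign change from + to − values. On the other hand, it is clear from
(82) that $\psi_n(\xi \mid S) - \psi_n(\xi \mid S')$ must change sign at least once. The proof of
the theorem is thus complete.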

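Theorem 4: $\psi_{n+1}(\cdot \mid S) \subset \psi_n(\cdot \mid S)$.

The argument is similar to that of the preceding theorem. We show that
$\psi_{n+1}(\xi \mid S) - \psi_n(\xi \mid S)$ has one sign change from + to − values as $\xi$ increases
from 0 to $\infty$. To this end, consider

(85)  $\psi_{n+1}(\xi \mid S) - \psi_n(\xi \mid S) = r(\xi) \int_{-\infty}^{\infty} e^{\theta \xi}\, \beta^{n+1}(\theta)\, e^{\theta S} f(\theta) \left[ \frac{\beta(\theta)}{A_1} - \frac{1}{B_1} \right] d\theta,$

where

$A_1 = \int_{-\infty}^{\infty} \beta^{n+1}(\theta)\, e^{\theta S} f(\theta)\, d\theta, \qquad B_1 = \int_{-\infty}^{\infty} \beta^{n}(\theta)\, e^{\theta S} f(\theta)\, d\theta,$

and

$\frac{1}{\beta(\theta)} = \int_0^\infty e^{\theta \xi} r(\xi)\, d\xi.$

Clearly $\beta(\theta) = +\infty$ for $\theta = -\infty$. Also $\beta(\theta)$ is a decreasing function of $\theta$ on its
natural domain of definition. Now appealing to the variation-diminishing
properties of the Pólya-type kernel $e^{\theta \xi}$, we find by comparing (84) and (85)
that $\psi_{n+1}(\xi \mid S) - \psi_n(\xi \mid S)$ changes sign once from + to − values as $\xi$ increases
from 0 to $\infty$. The result now follows as in the preceding theorem.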
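Theorems 3 and 4 can be checked numerically in one conjugate special case. The sketch assumes exponential demand with unknown rate and a Gamma a priori density on that rate; neither choice comes from the text. Under this assumption the a posteriori demand distribution has the closed form $P(\xi \le y \mid S, n) = 1 - [(b + S)/(b + S + y)]^{a+n}$, where $a$ and $b$ are the hypothetical prior shape and rate.

```python
def predictive_cdf(y, total, n, shape=2.0, rate=1.0):
    """P(demand <= y | S_n = total) for exponential demand.

    Assumes a Gamma(shape, rate) a priori density on the unknown demand
    rate, so the a posteriori density is Gamma(shape + n, rate + total).
    """
    if y <= 0:
        return 0.0
    post_shape = shape + n
    post_rate = rate + total
    return 1.0 - (post_rate / (post_rate + y)) ** post_shape


grid = [0.5 * k for k in range(1, 41)]

# Theorem 3: a larger observed total S' > S gives a stochastically larger
# demand, i.e. a distribution function that is nowhere larger.
theorem_3_holds = all(
    predictive_cdf(y, 5.0, 3) >= predictive_cdf(y, 8.0, 3) for y in grid
)

# Theorem 4: the same total spread over one more observation gives a
# stochastically smaller demand.
theorem_4_holds = all(
    predictive_cdf(y, 5.0, 4) >= predictive_cdf(y, 5.0, 3) for y in grid
)
```

The check covers only a finite grid and one family, so it illustrates the ordering rather than proving it.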

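We now prove the analogous theorems for the range family of distributions.

Consider

(86)  $\varphi(\xi \mid w) = q(\xi)\, r(w)\, \psi(\xi, w)$  where  $\psi(\xi, w) = 1$ if $\xi \le w$ and $\psi(\xi, w) = 0$ if $\xi > w$;

here $w$ is again an unknown parameter estimable in terms of an a priori density
function $f(w)$ $(w > 0)$. The joint density of an $n$-tuple of observations is

(87)  $\varphi(\xi_1, \dots, \xi_n \mid w)\, d\xi_1 \cdots d\xi_n = \prod_{i=1}^{n} \varphi(\xi_i \mid w)\, d\xi_1 \cdots d\xi_n = r^n(w) \prod_{i=1}^{n} q(\xi_i) \prod_{i=1}^{n} \psi(\xi_i, w)\, d\xi_1 \cdots d\xi_n.$

But

$\prod_{i=1}^{n} \psi(\xi_i, w) = \psi\big(\max_{1 \le i \le n} \xi_i,\, w\big),$

and thus

$v_n = \max_{1 \le i \le n} \xi_i$

defines a sufficient statistic for $w$. By Bayes' rule the a posteriori density of $w$
given $v_n$ reduces to

$h(w \mid v_n)\, dw = \frac{\psi(v_n, w)\, r^n(w) f(w)\, dw}{\int_0^\infty r^n(\theta)\, \psi(v_n, \theta) f(\theta)\, d\theta}.$

Hence the a posteriori density of $\xi$ given $v_n$ is

(88)  $\varphi(\xi \mid v_n)\, d\xi = \int_0^\infty r(\theta)\, q(\xi)\, \psi(\xi, \theta)\, h(\theta \mid v_n)\, d\theta\, d\xi = q(\xi)\, d\xi\, \frac{\int_0^\infty r^{n+1}(\theta)\, \psi(\xi, \theta)\, \psi(v_n, \theta) f(\theta)\, d\theta}{\int_0^\infty r^{n}(\theta)\, \psi(v_n, \theta) f(\theta)\, d\theta}.$

We introduce the notation $\rho_n(\xi \mid v_n)$ for this density.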

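Theorem 5: If $v' > v$, then $\rho_n(\cdot \mid v) \subset \rho_n(\cdot \mid v')$.

Proof: We have to prove that

$\int_0^y [\rho_n(\xi \mid v) - \rho_n(\xi \mid v')]\, d\xi \ge 0$  for all $y$.

Note first that

(89)  $\int_0^\infty [\rho_n(\xi \mid v) - \rho_n(\xi \mid v')]\, d\xi = 0.$

Thus it is enough to verify that $\rho_n(\xi \mid v) - \rho_n(\xi \mid v')$ changes sign once from + to
− values, and the theorem will be proved. To this end, we have

(90)  $\rho_n(\xi \mid v) - \rho_n(\xi \mid v') = q(\xi) \int_0^\infty r^{n+1}(\theta)\, \psi(\xi, \theta) f(\theta) \left[ \frac{\psi(v, \theta)}{A} - \frac{\psi(v', \theta)}{B} \right] d\theta,$

where

$A = \int_0^\infty r^n(\theta)\, \psi(v, \theta) f(\theta)\, d\theta, \qquad B = \int_0^\infty r^n(\theta)\, \psi(v', \theta) f(\theta)\, d\theta$

and $A > 0$, $B > 0$. Now

$r(\theta) = \frac{1}{\int_0^\infty q(\xi)\, \psi(\xi, \theta)\, d\xi} = \frac{1}{\int_0^\theta q(\xi)\, d\xi} > 0$  for $\theta > 0$.

For $v \le \theta < v'$, we have

$\left[ \frac{\psi(v, \theta)}{A} - \frac{\psi(v', \theta)}{B} \right] = \frac{1}{A};$

for $v < v' \le \theta$, we have

$\left[ \frac{\psi(v, \theta)}{A} - \frac{\psi(v', \theta)}{B} \right] = \frac{1}{A} - \frac{1}{B}.$

Since $v < v'$, it readily follows that $A > B$ and

$\left[ \frac{\psi(v, \theta)}{A} - \frac{\psi(v', \theta)}{B} \right] < 0$

for $v < v' \le \theta$. Thus,

$\left[ \frac{\psi(v, \theta)}{A} - \frac{\psi(v', \theta)}{B} \right]$

has one sign change from + to − values. But the kernel $\psi(\xi, \theta)$ is the density
of a Pólya-type distribution. Invoking the variation-diminishing properties of
this kernel, we infer that $\rho_n(\xi \mid v) - \rho_n(\xi \mid v')$ changes sign once from + to −
values. This completes the proof of the theorem.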

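Theorem 6: $\rho_{n+1}(\cdot \mid v) \subset \rho_n(\cdot \mid v)$.

Proof: As in the preceding theorem, we must show that $\rho_{n+1}(\cdot \mid v) - \rho_n(\cdot \mid v)$
changes sign once from + to − values. This is verified by the same kind of
arguments as before: i.e., we exploit the identity

$\rho_{n+1}(\xi \mid v) - \rho_n(\xi \mid v) = q(\xi) \int_0^\infty r^{n+1}(\theta)\, \psi(\xi, \theta)\, \psi(v, \theta) f(\theta) \left[ \frac{r(\theta)}{A_1} - \frac{1}{B_1} \right] d\theta,$

where

$A_1 = \int_0^\infty r^{n+1}(\theta)\, \psi(v, \theta) f(\theta)\, d\theta$  and  $B_1 = \int_0^\infty r^{n}(\theta)\, \psi(v, \theta) f(\theta)\, d\theta.$

We omit the remaining details.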
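The orderings of Theorems 5 and 6 can likewise be checked in a simple member of the range family. The sketch assumes demand uniform on $[0, w]$, so that $q(\xi) = 1$ and $r(w) = 1/w$, together with a Pareto a priori density on $w$; both are illustrative assumptions, as are the names `tail` and `floor` for the Pareto parameters. With $k = \text{tail} + n$ and $m = \max(\text{floor}, v)$, integrating (88) gives $P(\xi \le y \mid v) = ky / [(k+1)m]$ for $y \le m$ and $1 - (m/y)^k / (k+1)$ for $y > m$.

```python
def range_predictive_cdf(y, v_max, n, tail=1.0, floor=1.0):
    """P(demand <= y | v_n = v_max) for demand uniform on [0, w].

    Assumes a Pareto a priori density on w, proportional to
    w ** -(tail + 1) for w >= floor, so the a posteriori density is
    Pareto with index tail + n and lower limit max(floor, v_max).
    """
    if y <= 0:
        return 0.0
    k = tail + n
    m = max(floor, v_max)
    if y <= m:
        return k * y / ((k + 1.0) * m)
    return 1.0 - (m / y) ** k / (k + 1.0)


grid = [0.25 * k for k in range(1, 81)]

# Theorem 5: a larger observed maximum v' > v gives a stochastically
# larger demand.
theorem_5_holds = all(
    range_predictive_cdf(y, 3.0, 4) >= range_predictive_cdf(y, 6.0, 4)
    for y in grid
)

# Theorem 6: the same maximum after one more observation gives a
# stochastically smaller demand.
theorem_6_holds = all(
    range_predictive_cdf(y, 3.0, 5) >= range_predictive_cdf(y, 3.0, 4)
    for y in grid
)
```

As with the exponential family, this is a spot check on a grid for one prior, not a substitute for the variation-diminishing argument.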

We now return to our discussion of the inventory problem in which the distribution
of demands has an unknown parameter. Let $\varphi(\xi \mid S, n)$ be the a posteriori
density of $\xi$, where the value of the sufficient statistic is $S$ based on $n$
observations and where $\varphi(\xi \mid w)$ is a density belonging to either the exponential
or the range family. Furthermore, let $C^{N}(x \mid S, n)$ denote the expected loss
for an $N$-period inventory model when the initial stock is of size $x$, the demand
density for the first period is $\varphi(\xi \mid S, n)$, and an optimal purchasing policy is
followed. With this definition, $C^{N}(x \mid S, n)$ satisfies

(91)  $C^{N}(x \mid S, n) = \min_{y \ge x} \Big\{ c(y - x) + \int_0^y h(y - \xi)\, \varphi(\xi \mid S, n)\, d\xi + \int_y^\infty p(\xi - y)\, \varphi(\xi \mid S, n)\, d\xi$
      $+ \alpha \Big[ \int_y^\infty C^{N-1}(0 \mid S \cdot \xi, n + 1)\, \varphi(\xi \mid S, n)\, d\xi + \int_0^y C^{N-1}(y - \xi \mid S \cdot \xi, n + 1)\, \varphi(\xi \mid S, n)\, d\xi \Big] \Big\},$

where $S \cdot \xi$ is to be interpreted as $S + \xi$ if $\varphi(\xi \mid w)$ is a member of the exponential
family, and as $\max(S, \xi)$ if $\varphi(\xi \mid w)$ is a member of the range family. For
the one-period model in which the value of the sufficient statistic is either $S$
or $S'$ $(S < S')$, it follows from Theorems 2, 2', 3, and 5 that $\bar{x}_1(S) \le \bar{x}_1(S')$
and $y_1(x \mid S) \le y_1(x \mid S')$ for all $x \ge 0$. By virtue of these same theorems we
also have

(92)  $C^{(1)\prime}(x \mid S, n) \ge C^{(1)\prime}(x \mid S', n)$  for all $x \ge 0$.

Assume now that we have proved for the $(N - 1)$-period model the relations

$C^{(N-1)\prime}(x \mid S, n) \ge C^{(N-1)\prime}(x \mid S', n)$  for all $x \ge 0$,

and

$\bar{x}_{N-1}(S) \le \bar{x}_{N-1}(S')$  or  $y_{N-1}(x \mid S) \le y_{N-1}(x \mid S')$  for all $x \ge 0$.

Employing an induction argument parallel to that of Theorems 2 and 2' we obtain

$\bar{x}_{N}(S) \le \bar{x}_{N}(S')$  or
$y_{N}(x \mid S) \le y_{N}(x \mid S')$  for all $x \ge 0$.

The corresponding propositions can be established in the case of time lags.
Finally, by referring to Theorems 4 and 6 we achieve similar results pertaining
to the behavior of the critical numbers for the case of $n + 1$ observations versus
$n$ observations.

1. Scope of the Study

Scheduling heating oil production is an important management problem.
It is also a complex one. Weather and demand uncertainties, allocation of production
between different refineries, joint- and by-product relations, storage
limitations, maintenance of minimal supplies and many other factors need to be
considered.

This paper is concerned with one of an integrated series of operations research
studies directed toward improvement in such scheduling methods. Emphasis is
on the mathematical model. Institutional features and other phases
of the OR studies are brought in only as required.

Selection of this study phase as the first of a series of releases was
dictated by the (it is hoped) provision of a convenient
frame of reference for subsequent discussions. The scheduling model played a
[...] role in these studies. It should not be inferred, however, that it was the
most important (or valuable) portion. Other OR techniques were equally critical
from a methodological standpoint. 2 3 Some were readily available; others had to
be adapted or developed by cooperating company personnel and study par¬ 
ticipants. Finally, completion and perfection of all phases of the study is at¬ 
tendant on still further such developments, some of which will be indicated in the 
text of this article. 

Substantive considerations were at least as important as OR methodology. 
These were supplied from the accumulated experience and judgment of company 
officials who made their advice systematically available to the study groups. 
By drawing on this source for guidance in formulation, testing and validation, it 
was possible, in an expeditious manner, to develop the desired improvements in 
guides for setting the company’s heating oil schedules. 

2. General Considerations 

Only those portions of the model which are of general interest will be presented 
in this paper. It may be helpful, therefore, to commence with a sketch of the 
problem along with relevant background materials. Production is to be planned 
over a time interval specified in advance. This interval is called the horizon. 
The “objective” 4 is to maximize profits subject to a series of constraining re¬ 
lations over the horizon. Two such series of constraints will be considered: (1) 
“marketing” and (2) “storage.” 5 

The marketing constraints require production to be planned in a way which 
will meet customer demands as they materialize. Minimal inventory levels 
(deemed to be necessary for efficient functioning of the distribution system) are 
also incorporated in them. Another series of constraints refer to storage capacity. 
Schedules should honor maximum permissible inventory levels established 
relative to this capacity. 

Demand for this product is heavily weather dependent. The improved methods 
for forecasting weather components and consequent demands undertaken during 
the course of these studies need only be dealt with formally in this paper. For, as 
was recognized from the outset, only probability forecasts of demands and re¬ 
lated statistical distributions may (at best) be secured. Scheduling models must 
therefore be formulated accordingly and “exact” objectives and constraints re¬ 
placed by their probabilistic counterparts. Thus, the marketing constraints be¬ 
come probability relations with a stipulated high degree of reliability (confidence 
coefficient) for each scheduling period. Similar remarks apply to the storage 
constraints. Also, the exact functional (for the objective) is replaced by its ex¬ 
pected value. 

Maximization of expected profits is, in these terms, to be undertaken over the 


2 See, e.g., G. H. Symonds [22] for a brief sketch of some of the statistical considerations. 

3 Particular acknowledgment should be accorded Messrs. C. W. Foster, J. S. Hull, Jr., 
J. L. Keener, Jr. and S. C. Malloy as members of the coordinating committee. 

4 This term is used in a technical sense. See [7] for further discussion. 

5 Other constraints such as non-negativity conditions on refinery schedules must also be
observed.

entire horizon. A deeper consideration of the constraints, however, suggests that
the maximizing objective may be replaced by one of minimization. These constraints,
which are stated in physical terms, are interpreted to mean that demands
are to be taken as given (i.e., stochastically determined). 6 To the indicated
(high) degrees of reliability customer wants are to be met as they occur
due to weather (or related) considerations. The planning objective is thus to
supply whatever demands may emerge at minimum total cost.

Programs are to be determined stochastically. As events materialize refinery
rates for the following period are to be determined in conditional fashion taking
account of accumulating experience and future possibilities. The meaning and
quantitative consequences to be assigned to each such event are to be determined
beforehand so that plans can be formulated (and evaluated) in advance for each
possible contingency. Finally, it is desirable to be able to assess potential future
changes in structure or requirements. 7


A decision rule was devised for the purpose of comprehending these elements of
the problem. By means of detailed statistical studies the relevant densities were
isolated and tested. In some cases suitable transformations were required, as well
as means for converting from weather data to demand forecasts. The tests that
were conducted indicated that statistical independence of the weather dependent
portions of the demands could safely be assumed. This assumption will therefore
constitute one portion of what will hereafter be called the “null hypothesis.”
More fully, the null hypothesis, as thus validated, assumes that (a) the sales
densities are known and (b) the relevant variables are statistically independent.
Continuing test procedures may then be introduced in a manner analogous to
usages in statistical quality control, in order to make sure that significant (and
relevant) changes in the underlying universes do not go undetected.

By reference to these densities (all of them) suitable weights and adjustment
factors are calculated for incorporation in the decision rule. 8 First, it is necessary
to determine limits for these values relative to the constraints. Within these
limits the weights and adjustment factors required for applying the decision rule
are then determined with reference to the cost minimizing objective.

This part of the calculations requires recourse to suitable approximating 
routines since the relations are, in general, nonlinear. The decision rule is char¬ 
acterized as linear. Confusion will be avoided, however, if it is remembered that 

the rule is linear only after the weights (and adjustment factors) are determined.
As the random events materialize, in the form of actual sales demands, the


«.;S 9 L'lL d =,1r.' ‘ n Wh, * h " i '* “ T> *. 

tZSZEZZ? “ '■»— 

8 These are really a series of decision rules with the same general form.
9 By means of suitable controlled “Monte Carlo” calculations. No attempt will be made
in the present paper to describe these routines in detail.

weights and adjustment factors thus calculated (and incorporated in the rule) are
applied to determine the production schedules in succeeding periods. 10

The decision rule is applied in a manner which transforms the originally stated 
problem into one which deals with “certainty equivalents”. 11 This kind of trans¬ 
formation represents one way of dealing with probabilistic constraints and 
functionals. It is useful, therefore, to distinguish between this general approach 
and the particular rule employed. The latter may require modification in other 
contexts even when the general approach is applicable. 

Also, for the problem—at least as thus far described—the rule which was 
selected is not necessarily the best one against all possible contenders. 12 Factors 
in addition to those already mentioned entered into the choice of a suitable rule. 
Examination of past procedures indicated that mathematical reformulations 
would yield decision rules which were essentially linear. It was deemed desirable 
to utilize the experience incorporated in these procedures and to select a more 
general rule which would encompass them as special cases. Some of the other 
desiderata were as follows: (1) Computational expediency and flexibility in ap¬ 
plication. (2) A desire to secure assistance for locating “check dates” or sub¬ 
horizons, in reviewing schedules. (3) Some degree of stability in these schedules 
was also deemed to be desirable, even though it was not practically possible to 
state constraining relations in advance, or to adopt an alternative approach and 
impute specific costs for association with fluctuating production beyond certain 
limits. 13 Finally, methods were to be devised which would be amenable to other 
problems not directly involved in setting heating oil schedules. This was im¬ 
portant since these studies were to provide a test for the value of OR work to the 
company. 

After some experimentation a rule was adopted which seemed to give the best 
promise of yielding a combination of the desired results. It is expected that still 
further improvement will be made as experience develops. 14 

3. Cost Functions 

Additional considerations which also entered into model designs may best be 
introduced as the discussion proceeds. 

Carefully conducted cost studies were undertaken at relevant refineries. From 
these results a total company cost function was synthesized in the form 

10 By suitable adaptations it should be possible to extend these procedures for studies 
extending beyond the short-term scheduling horizon. Some of the problems associated with 
such extensions are discussed in the concluding section of this paper. 

11 See footnote 21, infra. 

12 This problem will be dealt with in a separate paper. It should perhaps be noted that 
certain differences in the heating oil problem, including the character of the constraints, 
precluded the possibility of direct access to the results (now classic) in [1], [11], and [12]. 

13 An analysis of such costs for ordinary manufacturing ( not refining) operations may be 
found in [5] and [14]. 

14 Such experience may also bring to light additional qualities which are desired such as a
rule which ends the scheduling horizon with a relatively low inventory level.

~~ Q>j + kjRj


( 1 ) 


where 


$c_j = c_j(R_j)$ represents average variable unit cost (in \$/bbl) at period
$j = 1, 2, \dots, N$.

$R_j$ = scheduled production (M bbls/day)

and $a_j$, $b_j$ and $k_j$ are positive constants, or parameters, which are applicable in
the $j$th period.

A series of (inverse) hyperbolas was thus obtained so that average variable
costs are convex and increasing functions of $R_j$ over the relevant range of
production rates.

Because of joint- and common-cost features it was necessary to estimate costs of
producing heating oil on the basis of incremental production above standard
levels. The methods of estimation were designed to insure, so far as possible, that
variations in demand for related products were supplied in a manner which would
not distort the estimates of cost for heating oil production. This was done for each
relevant refinery as a preliminary to synthesizing the total company function
shown in (1). For this purpose incremental production was assigned to the refineries
in a manner designed to achieve lowest total costs for the company at
each output level. This also had to be done for each period $j = 1, 2, \dots, N$ because
of time variations which were found to be present in the cost functions.

It would have been possible to render the model in such form as to achieve an 
inter-refinery allocation. This would have caused complications which did not 
appear to be warranted by the benefits that were apparent. 

There are additional cost features which required detailed attention in the 
actual study, although they are not of major interest for this presentation. 
Improved methods for forecasting material and transportation costs, and ascer¬ 
taining proper interest rates and inventory carrying charges, provide representa¬ 
tive specimens. For present purposes these can be regarded as included (at least 
formally) in the expression 


(2)  $c_j(R_j) R_j + T_j R_j + K_j \bar{I}_j = (c_j + T_j) R_j + K_j \bar{I}_j$

where

$T_j$ = transportation cost in period $j$ (in \$/bbl.)

$\bar{I}_j = (I_{j-1} + I_j)/2$ = average inventory (M bbls.) in period $j$.

$K_j$ = inventory carrying cost (in \$/bbl. per day) for period $j$.


15 In some cases (e.g., transportation costs) it was necessary to adjust the model to
allow for different horizons due to differing contract periods and other such institutional
features.

For simplicity, it is here assumed that produced crudes are transported to
the refinery at cost $T_j$ in period $j$. They are converted to heating oil and, if
necessary, stored in inventory for anticipated demands.

4. Model Details 

It is easier to enter the model through the constraints. First, there is a set of
sales constraints

(3.1)  $\Pr\Big\{ I_0 + \sum_{l=1}^{j} R_l \ge \sum_{l=1}^{j} S_l + I_{\min} \Big\} \ge \alpha_j$

where $\alpha_j$, $j = 1, 2, \dots, N$, is a suitably high “confidence coefficient” prescribed
for the $j$th scheduling interval. Second, there is a set of storage constraints,

(3.2)  $\Pr\Big\{ I_0 + \sum_{l=1}^{j} R_l \le \sum_{l=1}^{j} S_l + I_{\max} \Big\} \ge \mu_j.$

The conditions (3.1) and (3.2) together require that sales and storage conditions
be met with probabilities at least $\alpha_j$ and $\mu_j$, respectively, where

(3.3)
$I_0$ = initial inventory (M bbls.)
$R_j \ge 0$ is the production rate (stated in M bbls./day) to be scheduled in period $j$.
$S_j \ge 0$ is the anticipated sales for period $j$ (stated in M bbls./day).
$I_{\min}$ = a minimum inventory level (in M bbls.) which is to be maintained.
$I_{\max}$ = maximum inventory level (M bbls.) set by storage capacity.

Here $S_j$ is a random variable with known density

(3.4)  $f_j(S_j)$, $j = 1, 2, \dots, N$,

so that the magnitude of sales which will materialize in period $j$ is known only in
probability. This forms part of what is here called the null hypothesis.

To simplify the exposition and center attention on main principles, it will be
assumed that

(3.5)  $\mu_j = 1$, $j = 1, 2, \dots, N$.

Refinery schedules must then never violate 18 the storage limits. Similarly, the
non-negativity constraint

(3.6)  $\Pr\{ R_j \ge 0 \} = 1$, $j = 1, 2, \dots, N$,

is also stipulated with probability one. The value of $\alpha_j$ in each sales constraint
may be arbitrary. It is, however, usually prescribed at a high level.
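A sales constraint of the form (3.1) can be turned into a deterministic lower bound on cumulative production once the distribution of cumulative sales is fixed. The sketch below assumes that the period sales are independent (as in the null hypothesis) and, purely for illustration, normally distributed with unit-length periods; the normal form and the function name are assumptions, not part of the study. Under them (3.1) holds exactly when $I_0 + \sum R_l - I_{\min}$ is at least the $\alpha_j$ fractile of $\sum S_l$.

```python
import math
from statistics import NormalDist


def min_cumulative_production(i0, i_min, means, sds, alpha):
    """Smallest cumulative production through period j = len(means)
    that satisfies the sales constraint (3.1).

    Assumes independent, normally distributed period sales with the given
    means and standard deviations, and periods of unit length.
    """
    total_sales = NormalDist(
        mu=sum(means),
        sigma=math.sqrt(sum(s * s for s in sds)),
    )
    return total_sales.inv_cdf(alpha) + i_min - i0


# Hypothetical two-period data (M bbls): the bound is the 95 per cent
# fractile of cumulative sales, plus the minimum inventory, less the
# opening inventory.
bound = min_cumulative_production(
    i0=5.0, i_min=2.0, means=[10.0, 12.0], sds=[3.0, 4.0], alpha=0.95
)
```

With the storage limit taken at probability one as in (3.5), no comparable fractile exists for an unbounded distribution such as the normal, which is one reason the densities actually used need bounded ranges.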

16 More detailed treatment was needed in the actual study to allow for different locations,
etc. Cf., also, footnote 10, supra. Only the simplest case will be examined here.

17 See remarks in section 2, supra.

18 I.e., logically and on the null hypothesis.

The functional, to be maximized, may be formulated in terms of expected
profits, viz.,

(4)  $\max E\pi = \max_{R} \int_D \pi(S, R) f(S)\, dS$

where

$\pi(S, R) = \sum_{j=1}^{N} \big[ p_j S_j - (c_j + T_j) R_j - K_j \bar{I}_j \big],$

$\int_D \pi(S, R) f(S)\, dS = \int_{D_1} \cdots \int_{D_N} \pi(S, R) f_1(S_1) \cdots f_N(S_N)\, dS_1 \cdots dS_N$

and

$p_j$ = price (\$/bbl.) expected to prevail in period $j$.

$D_j$ = range of sales variation to be considered in period $j$.

$R, D$ = domains of refinery and sales variations to be considered.

The other terms have been previously defined in conjunction with expressions
(1)-(3.6). 19

To the measures of reliability stipulated in the constraints, maximization is to
be undertaken with respect to the anticipated sales developments. Since the
sales variables and their associated densities lie outside company control, it is
useful to decompose the profits functional into its receipts and cost components.
Specifically, the terms involving $p_j S_j$ are taken as constant with respect to
variation in the $R_j$'s. The problem is then converted from one of maximizing
expected profits to one of minimizing expected costs, i.e.,

(5)  $\min_{R} \sum_{j=1}^{N} \int_D \big[ (c_j + T_j) R_j + K_j \bar{I}_j \big] f(S)\, dS$

subject to (3.1), (3.2) and the associated non-negativity requirements (3.6).

5. Decision Rule and Certainty Equivalents 

Numerous methods may be devised for handling problems of this kind. Of the 
variants tried the one which appeared to be most successful involved the use of a 
decision rule molded along the following lines: 20 

19 In some instances it is desirable to allow for varying lengths in each period $j = 1, 2, \dots, N$.
This may easily be done but is here omitted to avoid extra complications.

20 Associated with this rule is a method of application (to be described in due course) and
the implicit stipulation

$\sum_{l=1}^{k-1} \beta_{kl} S_l + \sum_{l=1}^{k} \gamma_l - \sum_{l=1}^{k-2} \beta_{k-1,l} S_l - \sum_{l=1}^{k-1} \gamma_l \ge 0$

since $R_j \ge 0$. Some of the values $\gamma_l$ are thus allowed to be negative if better performance
can be secured by doing so. Because (6) is stated as a cumulant, inventory position in
period $j = 1, 2, \dots, N$ is also considered implicitly.

(6)  $\sum_{l=1}^{j} R_l = \sum_{l=1}^{j-1} \beta_{jl} S_l + \sum_{l=1}^{j} \gamma_l,$
     $R_1 = \gamma_1 \ge 0.$

This rule is linear, but, as has already been noted, only after the $\beta_{jl}$ and $\gamma_l$ are 
available. These values are jointly determined via the relevant densities to 
suitable degrees of approximation. 

The weights, $\beta$ and $\gamma$, are determined in advance for the entire horizon. 
Applied to past sales they yield what may be called “certainty equivalents” 
for the production schedules needed to satisfy emerging demands at desired 
levels of reliability. Thus, suppose that $R_1$, the refinery rate for the first period, is 
wanted. The value $\gamma_1$, known in advance, gives $R_1 = \gamma_1$. After $S_1$ has materialized, 
period 2 production is ascertained from $R_2 = \beta_{21} S_1 + \gamma_2$. 22 Continuing in 
this fashion each $R_j$, j = 1, 2, ..., N, is determined in conditional stochastic 
fashion, taking account of actual sales and production, forecasts of the future 
(via the weights) and forecast corrections whenever accumulating information 
and experience indicate that adjustments are required. 
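
The period-by-period use of the rule can be illustrated with a short sketch. It assumes the cumulative form of rule (6), takes the weights and constants as already computed, and uses made-up numbers; the function name `refinery_rates` and its argument layout are hypothetical, not part of the original paper.

```python
from typing import Sequence


def refinery_rates(beta: Sequence[Sequence[float]],
                   gamma: Sequence[float],
                   sales: Sequence[float]) -> list[float]:
    """Apply the cumulative linear rule (6) period by period.

    beta[j][l] weights the sales of period l+1 in the rule for period j+1
    (only entries with l < j are used); gamma[j] is the constant term.
    sales[l] is the realized sales of period l+1; only the first N-1 entries
    are needed, because R_j depends on sales through period j-1 only.
    """
    n = len(gamma)
    rates = []
    previous_cumulative = 0.0
    for j in range(n):
        # Cumulative production through period j+1, as prescribed by (6).
        cumulative = (sum(beta[j][l] * sales[l] for l in range(j))
                      + sum(gamma[: j + 1]))
        # The rate for the period is the increment of the cumulative schedule.
        rates.append(cumulative - previous_cumulative)
        previous_cumulative = cumulative
    return rates


# Illustrative numbers only (assumed, not taken from the paper).
gamma = [10.0, 2.0, -1.0]
beta = [[1.0] * 3 for _ in range(3)]   # all weights set equal to one
sales = [8.0, 9.0]                     # S_1 and S_2, known once they materialize
print(refinery_rates(beta, gamma, sales))
```

With all weights equal to one, each rate is the previous period's sales plus the constant for the current period, which is what the increments of the cumulative rule give.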

Although the weights are calculated by reference to the originally assumed 
(and tested) densities, it does not necessarily follow that they will have to be re¬ 
calculated for each new situation. Only at certain critical points (closely associ¬ 
ated with the “cost subhorizons”) 23 are these weights sensitive to variations in 

21 I.e., the known weights are to be applied to past sales (also known) in order to determine 
the refinery rates. When these adjustments are effected and the refinery rates set as 
though these (adjusted) “sales” are each to be met with perfect certainty then future 
(actual) sales demand will (on the null hypothesis) be satisfied at prescribed reliability 
levels. Note that the values of $\beta$ and $\gamma$ are determined for the entire horizon. Hence, the 
refinery rate scheduled for the next period applies not only to sales of that period but to 
other periods as well, allowing for inventory carrying costs and other charges and constraining 
relations which need to be considered in the overall optimization. 

For reference to the general problem of certainty equivalents and its relation to price and 
cost regimes which produce equivalent resource allocations under certainty and uncertainty, 
see [17], [18], [20] and [21]. Notice, however, that the present approach differs from 
these standard versions in at least the following respects: (1) In place of the usually 
employed parameters, e.g., the standard deviation as a measure of risk, the entire densities 
are used here in order to allow for such phenomena as oppositely skewed distributions which 
represent different business situations. (2) The usual exact pairing of points for income sub¬ 
ject to risk and the corresponding certainty equivalents is here replaced by inequalities. 
(3) These inequalities refer to relations rather than the usual number pairings. Notice also 
that the question of existence of certainty equivalents does not arise in the present context 
because the ranges of the sales densities are finite and below capacity limits. 

Confusion will be avoided if the above points are kept in mind while reviewing this 
literature. It will then also become apparent that, when viewed in terms of risk analysis, 
still another interpretation is possible for the objective in the present problem. By under¬ 
taking to minimize its cost relative to the constraining relations, the company may be 
viewed as attempting to maximize the surplus relative to the implied risk premiums. 

22 Note that the $\beta_{jl}$ for the same sales variable are (in general) different for each period. 

23 This term is explained in the immediately following paragraphs and elaborated in 
detail in later sections of this paper. 
the data. Over other intervals considerable “slack” is (in general) present in the 
system. The stability arising from this slack may be further reenforced by the 
possibility of offsetting variations in the densities along with differences in the 
cost functions which are applicable in different periods. Also, most of the relevant 
statistical tests are one-sided so that it is necessary to consider only those vari¬ 
ations which affect the size of the critical regions in the tails of the resulting 
distributions. 

Two stages of analysis are used to secure the values $\beta$ and $\gamma$. The decision rule 
(6) is first substituted in the constraints. This yields (among other things) lower 
bounds for ascertaining these components of the sales certainty equivalents. 
Optimization is then undertaken relative to these constraints. For this purpose 
the decision rule (6) is substituted in the functional and values $\beta$ and $\gamma$ are secured 
which minimize the cost of meeting the (inequality) levels prescribed in 
stage one. 

By this two-stage process the original problem is transformed to one which 
involves minimizing a convex functional subject to linear constraints. As such, 
the problem is amenable to solution by available methods which extend the power 
of linear programming techniques to embrace this class of nonlinear problems. 24 

Because of the need for systematizing the continuing test procedure and be¬ 
cause advance “check dates” were desired for committee review of schedules an 
alternate approach was employed to bring these factors to the fore. The concept 
of cost horizons was introduced and developed for these purposes. This term is 
used to distinguish the approach used here from more standard versions of sub¬ 
horizon analysis. 25 The latter proceed directly from production or sales data as 
follows. Relevant “peaks” in the production requirements are ascertained in a 
way which makes it possible to locate subhorizon intervals and to determine 
schedules within each such interval which minimize the total cost of meeting 
anticipated requirements. These “production horizon” approaches may be con¬ 
trasted with “cost horizons” in which the cost functions themselves provide the 
point of departure. Relevant “peaks” in incremental (or marginal) costs are 
ascertained for demarcating subhorizon dates. Setting these peaks as low as the 
constraints allow while increasing incremental costs for preceding periods as far as 
may be necessary yields a minimum total cost schedule which fulfills the require¬ 
ments at the critical subhorizon dates and (in general) overfulfills them in be¬ 
tween these dates. In principle, therefore, the cost horizon approach determines 
physical schedules by a series of cost inequalities. The physical “check dates” 
are also thereby ascertained. Finally, computational advantages may be secured, 
relative to the burden which would otherwise be incurred in a straightforward 
linear programming approach. 

24 Vide [2] and [4]. 

from IhfeSwdi 101 ' ^ ^ h ° WeVei ' ! f ° r an earlier V(!rsion in whirh the approach is also 

26 Cf. (3.1) and (3.2) which are stated as double inequalities. 
6. Procedure for Determining Certainty Equivalents 

The range over which integration is to be taken for the probability integrals 
furnishes a basis for the first stage of analysis in securing the required certainty 
equivalents. When (6) is inserted into (3.1) the sales constraints are re-expressed 
as 

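(7.1) $\Pr\{ I_0 - I_{\min} + \sum_{l=1}^{j} \gamma_l \ge \sum_{l=1}^{j-1} (1 - \beta_{jl}) S_l + S_j \} \ge \alpha_j$ 

for j = 1, 2, ..., N. Similarly, (3.2) becomes 

(7.2) $\Pr\{ I_0 - I_{\max} + \sum_{l=1}^{j} \gamma_l \le \sum_{l=1}^{j-1} (1 - \beta_{jl}) S_l + S_j \} = 1$ 

for the case in which all the probability levels prescribed in (3.2) are equal to 1. 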

This last expression is directly translatable as 

(8.1) $\sum_{l=1}^{j} \gamma_l \le I_{\max} - I_0 + \min_{S} \left[ \sum_{l=1}^{j-1} (1 - \beta_{jl}) S_l + S_j \right]$, j = 1, 2, ..., N. 

Its predecessor, the sales constraint, may be translated into analytical expressions 
of the form 


fBjk r B i 1 i-J: A , A 

I " 'I * • * j dSj XI Sjtfki&k) dSk ^ <*j 

00 «L-00£,-,/_ 1 -L—ooej-jfc A-oogj-j fc—1 


(8.2) where 


y-i 


B;* = 


A,(x) -Sj- E 11 - fti | 

Z-Jfc+l 


l - 


and 

j 

Ay( y) ^ Iq /min 2-r 7l 
2-1 

>§* == €jkSk 

£jk = db 1 according to whether ^ l 27 for j ^ 2. 
At j = 1 the condition reduces to 

(8.3) $F_1[A_1(\gamma)] = \int_{-\infty}^{A_1(\gamma)} f_1(S_1)\, dS_1 \ge \alpha_1$ 

or, 

$A_1(\gamma) \ge F_1^{-1}(\alpha_1)$. I.e., $\gamma_1 \ge F_1^{-1}(\alpha_1) + I_{\min} - I_0$. 

To satisfy the period one sales constraint it is therefore necessary to determine a 
value $\gamma_1$ which satisfies this inequality. It is similarly necessary to secure limits of 
integration which satisfy (8.2) in each period j = 1, 2, ..., N. 


27 When $\beta_{jk} = 1$ the integral involving this term is omitted from the iterated integration. 
The values of $\alpha_j$ are all close to unity. Indeed, it may be argued that this will be 
typical of most situations by recognizing the business significance of constraints 
such as (3.1) from which (8.2) was derived. 28 The upper limits of the integrals will 
usually have to be quite large to insure these results, i.e., each integral will have 
to include most of the area under its density. Recognition of these factors makes 
it practicable to develop computer routines which require exploration of only 
relatively small regions at the tails of the densities in order to obtain approximate 
solutions of sufficient validity. 

When large numbers of time periods are involved, recourse to high-speed 
computation facilities will normally be required. 29 If, however, the $\beta_{jk}$ are set 
equal to unity for $j > k \ge 1$ a considerable simplification is attained. 30 The 
calculations then proceed in a relatively simple and straightforward manner. 

This simplifying assumption will be made here. With all $\beta_{jk} = 1$ the decision 
rule (6) may be rewritten as 

(9.1) $\sum_{l=1}^{j} R_l = \sum_{l=1}^{j-1} S_l + \sum_{l=1}^{j} \gamma_l$, $R_1 = \gamma_1$ 

or, 

(9.2) $R_1 = \gamma_1$, $R_2 = S_1 + \gamma_2$, ..., $R_j = S_{j-1} + \gamma_j$ 

with the additional conditions (see footnote 20) that $S_{j-1} + \gamma_j \ge 0$. 

The constraints can now be reduced to linear inequalities on the $\gamma_j$. Since 
$\beta_{jk} = 1$, (7.1) becomes 

(10.1) $\Pr\{ I_0 - I_{\min} + \sum_{l=1}^{j} \gamma_l \ge S_j \} \ge \alpha_j$. 

An equivalent statement is 

(11) $I_0 - I_{\min} + \sum_{l=1}^{j} \gamma_l \ge F_j^{-1}(\alpha_j)$, 

where 

$F_j(x) = \int_{-\infty}^{x} f_j(S_j)\, dS_j$, j = 1, 2, ..., N. 

It should be noted that $F_j^{-1}(\alpha_j)$ is a specific number prescribed by $\alpha_j$ and the 
frequency distribution of sales in the j-th period. Since $I_0$, $I_{\min}$ and $F_j^{-1}(\alpha_j)$ are 


28 Cf. section 13. 

29 An approximati 
these calculation 
30 Normally these values have been found to be close to unity. 


ta-allKS,“‘ h0d “■ c ‘ lledMo '“* c ‘ ri »*“ >~n 
known, they provide appropriate stipulations 31 for constraints on the $\gamma$ values 
which are determined for (11). After optimization is undertaken the resulting 
values may be incorporated in (6). In conjunction with past sales (as they 
materialize) they supply the certainty equivalents by which refinery rates may be 
programmed. 
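
As a rough illustration of how the right-hand sides of (11) turn into numbers, the sketch below computes the lower bound on each cumulative sum of the $\gamma$ values. It assumes, purely for the example, that sales in each period are uniformly distributed on a finite range, so the inverse distribution function is available in closed form; the function names and all figures are hypothetical.

```python
def uniform_quantile(low: float, high: float, alpha: float) -> float:
    """Inverse distribution function F^{-1}(alpha) of a uniform density on [low, high]."""
    return low + alpha * (high - low)


def cumulative_gamma_bounds(i0, i_min, ranges, alphas):
    """Lower bounds on gamma_1 + ... + gamma_j implied by (11), one per period j.

    i0      -- opening inventory I_0
    i_min   -- minimum inventory I_min
    ranges  -- (low, high) sales range for each period (assumed uniform)
    alphas  -- required reliability level alpha_j for each period
    """
    return [uniform_quantile(lo, hi, a) + i_min - i0
            for (lo, hi), a in zip(ranges, alphas)]


# Illustrative figures only (assumed).
bounds = cumulative_gamma_bounds(
    i0=50.0, i_min=10.0,
    ranges=[(80.0, 120.0), (100.0, 140.0)],
    alphas=[0.95, 0.90],
)
print(bounds)   # one bound per period on the running total of the gammas
```

Any other density with a computable inverse distribution function could be substituted for the uniform one without changing the structure of the calculation.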

By a similar development, (7.2) becomes [see (8.1)]: 

(10.2) $I_{\max} - I_0 + \min S_j \ge \sum_{l=1}^{j} \gamma_l$, j = 1, 2, ..., N, 

and the condition $R_j \ge 0$ becomes 

(9.3) $\min S_{j-1} + \gamma_j \ge 0$, 

since non-negativity is prescribed with probability one. 

The expected costs in (5), which are to be minimized, i.e., 

(12) $\frac{1}{N} \sum_{j=1}^{N} \int_D \{ [c_j(R_j) + T_j(R_j)] + k_j I_j \} f(S)\, dS$, 

reduce to a sum of integrals, each over a specified domain, $D_j$, since, by (9.2), 
$R_j = S_{j-1} + \gamma_j$. Moreover, only $S_{j-1}$ (and hence only $\gamma_j$) is involved in the 
$c_j$, i.e., the cost elements which involve $R_j$ nonlinearly. 32 A separable convex 
function in the $\gamma_j$ is thus obtained. 

Minimization of such functionals, subject to linear inequalities, is (as Charnes 
and Lemke [2] have shown) 33 amenable to usual procedures employing adjacent 
extreme point techniques in linear programming to any desired degree of approximation. 
In this sense the problem may formally be regarded as solved. 

Because of other features—e.g., location of subhorizons—it is desirable, 
however, to explore additional approaches. One such additional approach can be 
developed by utilizing the necessary and sufficient conditions for minimization of 
non-linear convex functionals developed by H. W. Kuhn and A. W. Tucker [16]. 
The resulting analyses and interpretations can be used to simplify and guide the 
programming calculations. They can also be used to throw additional light on 
matters of general theoretical interest. It is this latter aspect which will be given 
primary emphasis in the discussion which follows. 

7. A Method for Scheduling by Cost Horizons 

The problem of minimizing (12) subject to (11) and (9.3) is a special case of 
the problem of finding $\hat{\gamma}_j$ that 

(13) minimize $C = \sum_{j=1}^{N} C_j(\hat{\gamma}_j)$ 

31 See [7]. 

32 For purposes of this discussion the transportation costs are taken as linear functions of 
the scheduled rates of production with the relevant constants known (with certainty) in 
advance. 

33 See also [4] for further development. 
subject to 

(14) $\sum_{l=1}^{j} \hat{\gamma}_l \ge \sigma_j$, j = 1, ..., N 

(15) $\hat{\gamma}_j \ge 0$, j = 1, ..., N. 

To see this result, one simply lets $\hat{\gamma}_j = \gamma_j + \min S_{j-1}$ for j > 1, $\hat{\gamma}_1 = \gamma_1$, 
and defines $C_j$ and $\sigma_j$ in the obvious way. 

In this section we develop a procedure for solving (13)-(15) under the assumptions 
that for each j the derivative $C'_j(\hat{\gamma}_j)$ of $C_j$ exists, that $C'_j(0) = 0$, 34 
and that $C'_j(\hat{\gamma}_j)$ is monotone increasing. The monotonicity of $C'_j$ implies 
that its inverse function $\hat{\gamma}_j(C'_j)$ is also monotone increasing. A final assumption 
is that $0 < \sigma_1 < \cdots < \sigma_N$. The procedure we are about to describe can be 
generalized to incorporate restrictions like (10.2). However, in order to simplify 
the argument, we will not do this. 

34 This hypothesis is not essential but simplifies the ensuing discussion. 

The necessary and sufficient conditions of optimization described by Kuhn 
and Tucker [16] are equivalent, for the present problem, to (14), (15), and 

(16) (a) $C'_j(\hat{\gamma}_j) \ge C'_{j+1}(\hat{\gamma}_{j+1})$ 

     (b) $C'_j(\hat{\gamma}_j) = C'_{j+1}(\hat{\gamma}_{j+1})$ whenever $\sum_{l=1}^{j} \hat{\gamma}_l > \sigma_j$ 

for j = 1, ..., N − 1, and $\sum_{l=1}^{N} \hat{\gamma}_l = \sigma_N$. 

A general characterization of the procedure to be employed is as follows. At 
each stage a new subhorizon is determined and old ones, possibly, removed. 
An optimum is thereby obtained for a problem which is more restricted than its 
predecessor but less restricted than the problem which is to be solved. 
In at most N stages the solution to the original problem is obtained. 

To commence the process, determine $c_0$ by 

(17) $\sum_{l=1}^{N} \hat{\gamma}_l(c_0) = \sigma_N$. 

The values $\hat{\gamma}_l(c_0)$ are then optimal for the less restricted problem in which 
only the last of the constraints (14) is imposed. If these values satisfy the remaining 
constraints as well an overall optimum is then achieved. 

Suppose that this is not the case. Let 

(18) $j_1$ = the first j such that $\sum_{l=1}^{j} \hat{\gamma}_l(c_0) < \sigma_j$. 

This value, $j_1$, is selected as a new subhorizon. 
New values $c_1$ and $c'_1$ are now determined by the respective conditions 
(19) $\sum_{l=1}^{j_1} \hat{\gamma}_l(c_1) = \sigma_{j_1}$, $\sum_{l=j_1+1}^{N} \hat{\gamma}_l(c'_1) = \sigma_N - \sigma_{j_1}$. 

Evidently $c_1 > c_0 > c'_1$. The values $\hat{\gamma}_l(c_1)$ and $\hat{\gamma}_l(c'_1)$ now furnish an optimum 
for a problem which includes at least one more production condition than its 
predecessor. It is also less restricted than the total problem so that if these new 
values of $\hat{\gamma}_l$ satisfy all of the originally stated constraints the overall optimum is 
achieved. 

It will suffice to carry the indicated procedure only one more stage. Suppose, 
therefore, that one or more of the originally stated constraints are still not 
satisfied. Let 

(20) $j_2$ be the first $j > j_1$ such that $\sum_{l=j_1+1}^{j} \hat{\gamma}_l(c'_1) < \sigma_j - \sigma_{j_1}$. 

Determine $c_2$ such that 

(21) $\sum_{l=j_1+1}^{j_2} \hat{\gamma}_l(c_2) = \sigma_{j_2} - \sigma_{j_1}$. 

Evidently $c_2 > c'_1$. Now if 

(i) $c_1 \ge c_2$ then determine $c'_2$ by 

(22) $\sum_{l=j_2+1}^{N} \hat{\gamma}_l(c'_2) = \sigma_N - \sigma_{j_2}$. 

Evidently $c'_1 > c'_2$ so that an optimum to a more restricted problem than its 
predecessor is achieved using $c_1$, $c_2$, $c'_2$ as the “marginal costs” in the ranges 
indicated. 

If 

(ii) $c_1 < c_2$ 

then the horizon at $j_1$ is to be eliminated. Determine $\bar{c}_2$ such that 

(23) $\sum_{l=1}^{j_2} \hat{\gamma}_l(\bar{c}_2) = \sigma_{j_2}$. 

Evidently $c_2 > \bar{c}_2 > c_1$. Also determine $c'_2$ such that 

(24) $\sum_{l=j_2+1}^{N} \hat{\gamma}_l(c'_2) = \sigma_N - \sigma_{j_2}$. 
In either case (i) or (ii) an optimum to a problem which 
is more restricted than its predecessor is secured. 

In carrying the stages further the only novel element is an extension of 
case (ii) whereby, say, $c_k$ is greater than several of its predecessor c's. The 
corresponding horizons are then to be eliminated, as is made evident by the 
foregoing discussion. Clearly in at most N stages the overall optimum is achieved. 

The values of the variables c so determined are, of course, appropriate values 
for satisfying the conditions (16). The inequalities in the text, 
labelled “evidently,” are consequences of the fact that the 
$\hat{\gamma}_j(C'_j)$ are monotone increasing functions. 
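
To make the structure of problem (13)-(15) concrete, here is a minimal sketch. It is not the staged procedure described above: it uses a simpler greedy variant that, under the stated assumptions, should arrive at a schedule satisfying (14), (15) and (16). It further assumes quadratic costs $C_j(\hat{\gamma}_j) = a_j \hat{\gamma}_j^2 / 2$, so that the inverse marginal cost is $\hat{\gamma}_j(c) = c / a_j$ in closed form. The function name, the coefficients `a` and all numbers are hypothetical.

```python
def schedule_by_cost_horizons(a, sigma):
    """Minimize sum_j a[j] * g[j]**2 / 2 subject to g[0] + ... + g[j] >= sigma[j].

    Assumes a[j] > 0 and 0 < sigma[0] < sigma[1] < ... (as in the text).
    Returns (g, horizons); horizons lists the 1-based periods at which the
    cumulative requirement is met exactly (the subhorizon dates).
    """
    n = len(a)
    g = [0.0] * n
    horizons = []
    start, base = 0, 0.0              # first open period, requirement already covered
    while start < n:
        best_c, best_j, inv_sum = float("-inf"), start, 0.0
        for j in range(start, n):
            inv_sum += 1.0 / a[j]
            # Common marginal cost at which periods start..j meet sigma[j] exactly.
            c = (sigma[j] - base) / inv_sum
            if c >= best_c:           # ">=" keeps the latest period among ties
                best_c, best_j = c, j
        for l in range(start, best_j + 1):
            g[l] = best_c / a[l]      # inverse marginal cost: g_l(c) = c / a_l
        horizons.append(best_j + 1)
        start, base = best_j + 1, sigma[best_j]
    return g, horizons


# Illustrative data only (assumed).
g, horizons = schedule_by_cost_horizons([1.0, 1.0, 1.0], [3.0, 4.0, 5.0])
print(g, horizons)
```

Within each block of periods ending at a subhorizon the marginal cost is constant, and it can only fall from one block to the next, which is the pattern conditions (16) describe. Taking the period with the largest required marginal cost as the next subhorizon is the design choice that keeps every earlier cumulative requirement satisfied.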
2. Structure of the Model 


A. Demand Distribution 

The S, s policy—with constant values of s and S —can be optimal only when 
the stochastic process by which demand is generated has sufficient stationarity 
properties. In the following, it will be assumed that this process is purely 
discontinuous 2 and that after every event the process is again independent of the 
past. 

One way of describing the demand distribution in terms of this process is as 
follows. The distribution of the length of intervals between successive events is 
given. Different intervals are independently and identically distributed. Also 
given is the probability distribution of the quantity demanded at an event. This 
distribution may depend on the length of the time interval preceding it, but it 
is independent of any previous (or subsequent) demands. 

An alternative description is in terms of the probability that a quantity i is 
demanded during an interval t following an event. The two descriptions will be 
related to each other. 


B. Supply 

Delivery of an order is assumed to require a fixed time, T. No restriction is 
placed on the size or number of orders that may be outstanding at any time. 
There is a fixed cost of ordering, k, which is incurred when the order is 
made. 

Per item and per unit time, a carrying cost of h dollars is incurred. Per 
unit shortage and per unit time a penalty cost of g dollars is incurred. Let y 
denote the stock level when positive, the shortage level when negative. Then 

$f(y) = \begin{cases} hy, & y \ge 0 \\ -gy, & y < 0 \end{cases}$ 

denotes the storage and shortage cost function. Future costs are discounted by 
a discount rate of $d = e^{-\alpha}$ per unit time. The unavoidable outlay of the purchasing 
price per item need not be considered: the effect that the timing of this 
expense has on total cost may be contained in the storage and shortage cost 
function. 


C. Decisions 

Decisions concerning whether to order stock replacements and how much to 
order may be made immediately after a demand has arisen. While this is obviously 
less general than the case in which decisions may be made at any time, 
this restriction does not seem to be of practical importance. 

2 For a definition see [3]. Continuous processes may occur when the commodity under 
consideration is infinitely divisible, such as a liquid, and its rate of consumption is a random 
variable. The examination of some stationary continuous processes has shown that the 
S, s policy is always optimal and that the principal formula (8) to be derived below remains 
valid provided the sum is replaced by an integral. Consideration of the more typical 
discontinuous stochastic processes is therefore no real restriction of generality. 

D. Translation in Time 

Since delivery requires a time of exactly T, it is natural to associate with the 
present time not the inventory cost which is incurred now but that which is 
incurred T units of time later. 

E. Horizon 

The planning horizon is infinite, i.e., no definite duration of the program is 
envisaged. However, by adjusting the discount rate and carrying cost, one can 
allow for the possibility that with the constant probability y dt, the system 
will be terminated in the next dt units of time, and that the salvage value of the 
inventory (both stock-on-hand and on-order) is a fraction of the unit cost of 
an item, so that a loss of c dollars is incurred upon termination. Of course, c 
may be zero when the stock can be liquidated at its full value. 

F. Object 

The decision objective is to minimize the expected value of present and dis¬ 
counted future avoidable cost. Unavoidable costs are, for instance, those in¬ 
curred between now and time T (i.e., during the delivery time) because no 
decisions we now make can take effect before a time span T has elapsed. 
It will be shown that this is equivalent to a first degree of approximation to 
minimizing cost averaged over time. 

This expected cost—or loss function—depends on two “state variables” of 
the system: the inventory on-hand plus on-order, and the time that has elapsed 
since the last demand. However, we shall restrict our consideration of the loss 
function to times immediately after demand. Our first task is to formulate a 
recursive or “dynamic programming” equation for the loss function L(y), i.e., 
for the expected value of discounted avoidable cost at a time immediately after 
a demand, conditional on inventory y and an optimum policy. 

G. Probability Concepts 

The following probabilities are used: 

p(i, t) the probability that i units are demanded during an interval of 
length t immediately following a demand 

q(t) dt the probability that the interval between successive demands lies 
between t and t + dt 

$\pi_j(t)$ the conditional probability that the quantity demanded is j given 
that the time interval since the last demand is t 

$\rho(n, t)$ the probability that there will be n occasions of demand during an 
interval of length t following a demand 
The following identities will be used: 

$p(0, t) = \int_t^\infty q(\tau)\, d\tau$ 

$p(i, t) = \sum_{n=1}^{\infty} \int_0^t [\pi_i(\tau) q(\tau)]^{(n)} [1 - Q(t - \tau)]\, d\tau$ $(i > 0)$, 

where $[\ ]^{(n)}$ denotes the nth convolution of the probability in brackets, defined 
recursively as 

$[\pi_i(t) q(t)]^{(n)} = \sum_{j=0}^{i} \int_0^t [\pi_{i-j}(t - \tau) q(t - \tau)]^{(n-1)} \pi_j(\tau) q(\tau)\, d\tau$ 

$[\pi_i(t) q(t)]^{(1)} = \pi_i(t) q(t)$ 

and 

$Q(t) = \int_0^t q(\tau)\, d\tau$ 

is the probability that a demand occurs in time t or less; hence 1 − Q(t) 
is the probability of no occasion for demand during an interval of length t 
following after a demand. 

When $\pi_i(t) = \pi_i$ is independent of t, this formula reduces to 

$p(i, t) = \sum_{n=0}^{\infty} \rho(n, t)\, \pi_i^{(n)}$ 

where $\rho(n, t)$ is the probability of n occasions of demand during an interval of 
length t following a demand (see Section 6, B, 3). 

3. The Inventory Equation 

let the stock 

S"rtaT?i^“, but placed F units o, time lat». What is the ea- 
pected value of this cost conditional on jj and f. demanded. 

Suppose first that« < 

p';, ;!" -t Thusthe probability of physical stock being p - « at a tune 

r, T ^ r ^ T + t, is 


± f‘ -3 ,t- *) dt 

y=o Jq 


Discounting and integrating over the interval F S r £ F + b we have the 

expected cost 

£ f T q(t)rj(t) f + p(* ~ J> T - * )e ~“ T dT dt '^ V ~ l) ' 

f^o Jo 
Second, let $t \ge T$. Then physical stock equals y during $T < \tau < t$. For any 

s ' st+T ’^ of f 5t „c- k L™. 7° ts: 


2 

Z Vi(t)p(i - j,T - t). 

Discounting and integrating we have 

L q(t) [f T e “ T ^y ) + l Z rj(t)p(i -j,r- t)e aT drf(y - *)J 

The total cost is therefore 

F(V) = § [I qit) § ^ J r V(i ~ j> t — t)e™ dr dt 

f°° + r T+t -i 

+ Jr 9< ^ S ^ J t ~ hr- t)e~ aT dr dij f(y 

+ I T q(t) l l e ~ aT ~ e ~ a ‘] dtf(y) 

= e~ aT X) brjfiy - j) 
y=o 

where 

6 " l C «<«[' - O dt - /*„(,) fe-" dT 

*'0 Jo 


d< 


0 


( 1 ) 




and 


__ e r rT+t 

r ° = T ! J T p(0, T - t)e~ aT dr dt 

e aT f 00 rT+t 

+ TJr 9(tU(t) J t p(0, r - t)e~*' dr dt 
e aT f“ 1 

+ TJr q(t) a [e ~° T ~ e ~^ dt (2) 

_ e aT r T + rr+t 

r< ~ b l q(t) § T ’ (t) J T P(* ~hr- t)e~ ar dr dt 

fP-T poo { T+t 

+ T1 t »«> £ *«> /, >><«' - },, - ()«-" * dl 

Here, 6 is so defined as to make 

00 

Z n = l 

jw v(j 1), and we have the simpler 
expressions 

f q (t)[e-° T -e-“‘]dt 

o a J t 

aT /•**+* / 0 \ 

r>i = -T- [ «(0 [ Vd - !> T — f ) e 
0 Jo r 

aT /*oo /*T4-f 

+ — / g(<) / p(i — 1, t — t)e “ T dr dt 

b J t Jt 

Observe that since F(y) is a weighted sum, with non-negative weights, of convex 
functions f(y − i), F(y) is convex. 

Let L(x) denote the total expected discounted cost over the infinite time 
horizon when an optimal policy is followed and x is the amount of stock on 
hand and on order immediately after a demand has occurred but before a decision 
whether or not to order is made. Denote by y the amount of stock on hand 
and on order after an order is placed. Clearly y ($\ge x$) should be chosen so as to 
minimize 

$k\delta(y - x) + F(y) + \sum_{i=0}^{\infty} \left[ \int_0^\infty \pi_i(t) q(t) e^{-\alpha t}\, dt \right] L(y - i)$ (3) 

where $\delta(z) = 0$ for $z = 0$ and $\delta(z) = 1$ for $z > 0$. 

Hence 

$L(x) = \min_{y \ge x} \left[ k\delta(y - x) + F(y) + \sum_{i=0}^{\infty} \left( \int_0^\infty \pi_i(t) q(t) e^{-\alpha t}\, dt \right) L(y - i) \right]$. 


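Define 

$\int_0^\infty \pi_i(t) q(t) e^{-\alpha t}\, dt = a p_i$ 

where 

$a = \int_0^\infty q(t) e^{-\alpha t}\, dt$. 

Since 

$\sum_{i=0}^{\infty} p_i = \frac{1}{a} \sum_{i=0}^{\infty} \int_0^\infty \pi_i(t) q(t) e^{-\alpha t}\, dt = \frac{1}{a} \int_0^\infty q(t) e^{-\alpha t}\, dt = 1$ 

and $p_i \ge 0$, $p_i$ may be regarded as a probability. Now 

$L(x) = \min_{y \ge x} \left[ k\delta(y - x) + F(y) + a \sum_{i=0}^{\infty} p_i L(y - i) \right]$ (4) 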

In the special case of unit demands 

$L(x) = \min_{y \ge x} [k\delta(y - x) + F(y) + aL(y - 1)]$ (5) 

If maintaining an inventory is at all optimal, there will be integers s, S (S > s) 
such that if $x \le s$, the minimizing value of y in (5) is S while if x > s, the 
minimizing value of y in (5) is x [6]. We then obtain the following difference 
equation from (5): 

$L(x) = F(x) + aL(x - 1)$, $x > s$ 
$L(x) = k + L(S)$, $x \le s$ 

Thus 

$L(x) = \sum_{i=0}^{x-s-1} F(x - i) a^i + a^{x-s} (k + L(S))$, $x > s$. 

In particular, 

$L(S) = \sum_{i=0}^{D-1} F(S - i) a^i + a^D [k + L(S)]$, where D = S − s, 

or 

$L(S) = \dfrac{a^D k + \sum_{i=0}^{D-1} F(S - i) a^i}{1 - a^D}$. 

When s + 1 > 0, as we may assume in all practical cases, 

$L(0) = k + L(S)$ 

$L(0) = \dfrac{k + \sum_{i=0}^{D-1} F(S - i) a^i}{1 - a^D}$ (6) 
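
The closed form (6) can be checked numerically against the difference equation it was derived from. The sketch below assumes unit demands, a discount factor a strictly between 0 and 1, and an arbitrary illustrative cost function F; the function names and all numbers are hypothetical.

```python
def loss_at_zero(F, k, a, s, S):
    """L(0) from (6): (k + sum_{i<D} F(S - i) * a**i) / (1 - a**D), with D = S - s."""
    D = S - s
    cycle_cost = sum(F(S - i) * a**i for i in range(D))
    return (k + cycle_cost) / (1.0 - a**D)


def loss_by_recursion(F, k, a, s, S, x):
    """L(x) for x >= s, unrolling L(x) = F(x) + a * L(x - 1) upward from L(s).

    L(s) = k + L(S), which equals L(0) when s >= 0.
    """
    value = loss_at_zero(F, k, a, s, S)   # L(s) = k + L(S) = L(0)
    for y in range(s + 1, x + 1):
        value = F(y) + a * value
    return value


# Illustrative inputs only (assumed): linear holding cost, no shortages reached.
F = lambda y: 2.0 * y
k, a, s, S = 5.0, 0.9, 1, 4

L0 = loss_at_zero(F, k, a, s, S)
LS = loss_by_recursion(F, k, a, s, S, x=S)
print(L0, LS)   # the two should differ by exactly the ordering cost k
```

If the closed form and the recursion agree, L(0) exceeds L(S) by exactly the fixed ordering cost, which is the relation L(0) = k + L(S) stated above.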


^i°Zpr^“ ,i " iVed ^ W ^ Special 0866 th '“ d <““ d “ *^<1 by 

Returning to the general equation (4), observe that it is an Arrow-Harris-Marschak 
equation. Since F(y) is convex, Scarf’s theorem [6] applies, stating 
that the optimal policy is an s, S policy: 

$y = S$ if $x \le s$; $y = x$ if $x > s$. 

Let $p_i^{(n)}$ denote the nth convolution of $p_i$, and let $w_n(i)$ be the probability that n 



IX-34 —AN INVENTOKY MODEL 


429 


trials are required to yield a total demand of i units or more. (This is of course 
based on $p_i$.) Then the solution of (4) is 

$L(x) = \sum_{n=0}^{\infty} \left( \sum_{i=0}^{x-s-1} a^n p_i^{(n)} F(x - i) + a^n w_n(x - s) [k + L(S)] \right)$, $x > s$. 

In particular, 

$L(S) = \dfrac{k \sum_{n=0}^{\infty} a^n w_n(D) + \sum_{n=0}^{\infty} \sum_{i=0}^{D-1} a^n p_i^{(n)} F(S - i)}{1 - \sum_{n=0}^{\infty} a^n w_n(D)}$ 

and if $s \ge 0$ 

$L(0) = L(S) + k$ 

$L(0) = \dfrac{k + \sum_{n=0}^{\infty} \sum_{i=0}^{D-1} a^n p_i^{(n)} F(S - i)}{1 - \sum_{n=0}^{\infty} a^n w_n(D)}$ (7) 


4. Discussion of the Loss Function 


A. Alternative Formulation 

In Equation (7), the timing and accumulation of demand appear separately. 
The first is expressed by a and its powers, the second by the probabilities $p_i^{(n)}$ 
and $w_n(D)$. An alternative expression can be given which refers to the basic 
probabilities p(i, t), where demand equals i in an interval of length t (which 
starts immediately after a demand), which describe the stochastic process. 
We will show in the appendix that (7) is equivalent to: 


$L(0) = \dfrac{k + \sum_{i=0}^{D-1} \frac{1}{b} F(S - i) \int_0^\infty p(i, t) e^{-\alpha t}\, dt}{\alpha \int_0^\infty \sum_{i=0}^{D-1} p(i, t) e^{-\alpha t}\, dt}$ (8) 

The relationship of p(i, t) to the probabilities q(t), $\pi_i(t)$ of the previous analysis 
is exhibited by the definition of p(i, t) on p. 425. 

Formula (8) is convenient when the detailed structure of the process that 
generates demand is not known, but the demand probabilities for time intervals 
(after demands) of any length are known. 

The numerator of this formula is intuitively obvious; it represents the expected 
value of storage and shortage cost, properly discounted, through an 
inventory cycle. To interpret the denominator, note the identity 

$\int_t^\infty q(\tau, D)\, d\tau = \sum_{i=0}^{D-1} p(i, t)$, 

3 A general discussion of the relationship between the spacing of events and the cumula¬ 
tive number of events that occur during an interval of time may be found in [7]. 



430 


IX-34 —STOCHASTIC DECISION MODELS 


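where q(t, i) is the probability that demand first equals i at time t. Differentiating 
with respect to t, 

$-q(t, D) = \sum_{i=0}^{D-1} \dfrac{\partial p(i, t)}{\partial t}$ (8') 

Consider 

$1 - \int_0^\infty q(t, D) e^{-\alpha t}\, dt = \alpha \int_0^\infty \sum_{i=0}^{D-1} p(i, t) e^{-\alpha t}\, dt$, 

after integration by parts. Thus, 

$L(0) = \dfrac{k + \sum_{i=0}^{D-1} \frac{1}{b} F(S - i) \int_0^\infty p(i, t) e^{-\alpha t}\, dt}{1 - \int_0^\infty q(t, D) e^{-\alpha t}\, dt}$ 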

The meaning of this denominator becomes clear when we use the approximation 
$e^{-\alpha t} \approx 1 - \alpha t$, which is legitimate since $\alpha$ is small and q(t, D) is 
negligible for large t: 

$\int_0^\infty q(t, D) e^{-\alpha t}\, dt \approx 1 - \alpha \int_0^\infty t\, q(t, D)\, dt = 1 - \alpha \bar{t}$, 

say, where $\bar{t}$ denotes the average length of an inventory cycle. Thus, 

$L(0) \approx \dfrac{k + \sum_{i=0}^{D-1} \frac{1}{b} F(S - i) \int_0^\infty p(i, t) e^{-\alpha t}\, dt}{\alpha \bar{t}}$ (8'') 

Up to a factor of proportionality, Equation (8'') and hence (8) is approximately 
the average cost of the system per unit of time. The same can be shown 
with Equation (6) or (7). In terms of f(y), Equation (8) is given by 


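$L(0) = \dfrac{k + e^{-\alpha T} \sum_{i=0}^{D-1} \sum_{j=0}^{\infty} f(S - i - j)\, r_j \int_0^\infty p(i, t) e^{-\alpha t}\, dt}{\alpha \sum_{i=0}^{D-1} \int_0^\infty p(i, t) e^{-\alpha t}\, dt}$ (8''') 

Finally, for convenience, let 

$e^{-\alpha T} \int_0^\infty p(i, t) e^{-\alpha t}\, dt = u_i$, 

whereupon, by (7), (8), 

$u_i = e^{-\alpha T} b \sum_{n=0}^{\infty} a^n p_i^{(n)}$. 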
5. Reorder Point and Reorder Quantity Determination 

A. Determination of s and D for Unit Demand 

Before deriving general formulas for the s and D which minimize the loss 
function, it is convenient to consider these parameters in the case where demands 
are for one unit at a time. For simplicity, $\alpha$ is set to zero so that average 
cost is minimized per unit of time (see page 430, above). 

The definition of $u_i$ shows that since $p_i^{(n)} = \delta_{ni}$, $\alpha = 0$, and a = 1, then 
$u_i = b$. Therefore, 

$\lim_{\alpha \to 0} \alpha L(0) = \dfrac{k + b \sum_{i=0}^{D-1} \sum_{j=0}^{\infty} r_j f(s + D - i - j)}{Db} = L_{s,D}$ (9) 

represents indeed average cost per unit time, b being the average time between 
successive demands. A simpler expression for (9) is 

$L_{s,D} = \dfrac{k + \sum_{i=0}^{D-1} F(s + D - i)}{bD}$ (10) 

The optimal s is a smallest integer for which 

$L_{s+1,D} - L_{s,D} > 0$; 

i.e., 

$\sum_{i=0}^{D-1} \Delta_s F(s + D - i) = F(s + D + 1) - F(s + 1) > 0$, 

or 

$\sum_{j=0}^{\infty} r_j [f(s + D + 1 - j) - f(s + 1 - j)] > 0$. (11) 

The storage and shortage costs will now be specified as follows: 

$f(y) = \begin{cases} hy, & y \ge 0 \\ -gy, & y < 0 \end{cases}$ (12) 

The inequality (11) assumes the form: 

$\sum_{j=0}^{S} h(S + 1 - j) r_j + \sum_{j=S+1}^{\infty} g(j - S - 1) r_j - \sum_{j=0}^{s} h(s + 1 - j) r_j - \sum_{j=s+1}^{\infty} g(j - s - 1) r_j > 0$. 

Since $\sum_{j=0}^{\infty} r_j = 1$, this becomes 

$(h + g) D \sum_{j=0}^{s} r_j + (h + g) \sum_{j=s+1}^{S} (S + 1 - j) r_j - gD > 0$. 
Writing $R_v = \sum_{j=0}^{v} r_j$, 

$\dfrac{1}{D} \sum_{i=0}^{D-1} R_{s+i+1} > \dfrac{g}{g + h}$ (13) 

This may be interpreted as follows: The left-hand side represents a weighted 
average of $R_v$, say $R_{s+\theta}$, where $0 < \theta < D$. Expression (13) says that the 
probability of demand exceeding $s + \theta$ should be kept below h/(h + g), a 
small fraction since generally g is large compared to h. A conservative solution 
is to set $\theta = 0$, and have $R_s > g/(g + h)$. 
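
The conservative rule can be turned into a small search over the cumulative distribution. The sketch below reads the rule as "take the smallest s whose cumulative probability $R_s$ exceeds g/(g + h)", which is an interpretation rather than something the text states in so many words; the probabilities `r` and the cost figures are made up for illustration, and the function name is hypothetical.

```python
from itertools import accumulate


def conservative_reorder_point(r, g, h):
    """Smallest s with R_s = r[0] + ... + r[s] > g / (g + h).

    r -- probabilities r_0, r_1, ... of demand 0, 1, ... (assumed to sum to 1)
    g -- shortage penalty per unit and unit time
    h -- carrying cost per unit and unit time
    """
    threshold = g / (g + h)
    for s, cumulative in enumerate(accumulate(r)):
        if cumulative > threshold:
            return s
    raise ValueError("the distribution r never exceeds the required level")


# Illustrative distribution and costs only (assumed).
r = [0.5, 0.3, 0.15, 0.05]
print(conservative_reorder_point(r, g=9.0, h=1.0))
```

Because g is usually large relative to h, the threshold sits close to one and the search typically runs well into the upper tail of the distribution.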

Similarly, the optimal D is the smallest integer for which $L_{s,D+1} - L_{s,D} > 0$; i.e., 

$\Delta_D \left[ \dfrac{1}{D} \left( k + \sum_{i=0}^{D-1} F(s + D - i) \right) \right] > 0$, 

which implies 

$\sum_{i=0}^{D-1} [F(S + 1) - F(S - i)] > k$. 

Substituting for F, 

$\sum_{i=0}^{D-1} \sum_{j=0}^{\infty} r_j [f(S + 1 - j) - f(S - i - j)] > \dfrac{k}{b}$ 

and substituting for f, 

$\sum_{i=0}^{D-1} \left\{ \sum_{j=0}^{S-i-1} h(1 + i) r_j + \sum_{j=S-i}^{S} [h(S + 1 - j) - g(i + j - S)] r_j - \sum_{j=S+1}^{\infty} g(1 + i) r_j \right\} > \dfrac{k}{b}$ (14) 

By definition of $r_j$, $r_j$ is small for large j. If, as a rough approximation, $r_j$ is 
set to zero for j > s, then (14) becomes simply 

$\sum_{i=0}^{D-1} h(1 + i) > \dfrac{k}{b}$ 

or 

$\dfrac{hD(D + 1)}{2} > km$ 

where m = 1/b is the average demand per unit time. This may be written 

$D(D + 1) > \dfrac{2km}{h}$ 
or 

$D = \sqrt{\dfrac{2km}{h}}$, (14') 

which is the Wilson lot-size formula. Without this simplification, (14) leads to 

$\sum_{i=0}^{D-1} \left\{ \left[ (i + 1) R_{S-i-1} + \sum_{j=S-i}^{S} (S + 1 - j) r_j \right] (g + h) - (i + 1) g \right\} > \dfrac{k}{b}$ (15) 
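
For a quick numerical feel for the simplified lot-size condition, the sketch below evaluates both the Wilson square-root formula and the smallest integer D satisfying hD(D + 1)/2 > km. Reading that inequality as strict is an assumption, since the source is not fully legible at that point; the function names and the figures are hypothetical.

```python
import math


def wilson_lot_size(k, m, h):
    """Continuous approximation: D = sqrt(2 k m / h)."""
    return math.sqrt(2.0 * k * m / h)


def smallest_integer_lot(k, m, h):
    """Smallest integer D with h * D * (D + 1) / 2 > k * m (strictness assumed)."""
    D = 1
    while h * D * (D + 1) / 2.0 <= k * m:
        D += 1
    return D


# Illustrative figures only (assumed): ordering cost k, demand rate m, carrying cost h.
k, m, h = 100.0, 10.0, 5.0
print(wilson_lot_size(k, m, h), smallest_integer_lot(k, m, h))
```

The two answers stay close because D(D + 1) and D squared differ only by D, which is small relative to 2km/h when lots are large.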


B. Determination of s and D in General 
Substituting (12) for f(y) in (8'"), 

ke aT + 22 Mi |~22 (S — i — j)rj + 22 g‘(i + j — ^Jr/l 

T 1=0 Li=0 _______ j=8—i+l __J 

" ““ -—— D_1 

a ^2 u i 
1=0 

Adding and subtracting <7- (i + j — 5)*j , 

k + ]C (A + g(S - i - j)rj + g-(i+J - 5)1 

r i=o Li-o___=1 

^ — — D—1 ' ’ 

a 22 Mi 

i=0 


(16) 


where J — 22vLn ir,-. Setting S — s D and R v 22*-=o r * > 

k + £ M,[(h + g)R, +D -i -1 + g-(i +J - S - D)] 


D —1 

a 22 M f 

1=0 


Differencing with respect to s yields the following condition of optimality: 
The optimal s is the smallest integer for which 


D —1 

^ y V>i P>s+D —i 

1=0 _ 

D —1 

i=0 


> 


0 + h 


(17) 


This formula is reminiscent of the “newsboy” equation, 


Ps = 


h 

g + h 


(170 


Differencing (16) with respect to D, the following condition for optimality
results:
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a g Ui^UzKh, + g)R, + g(j - 1 - S )J + g u .[( h + g )R t+B _ i ^ _ ? j| 

/ D ~ l 

- om D jfc + g Ui [(h + g)R, +D _i + g(i + j _ s _ £)]! > o_ 
From this, the optimal D is a smallest integer for which 

D—l r~ 

g U(D - t - 1) - (h + g) £ fly UiU D 

L y-t+i J 


r "I D—1 

+ 2-/ (A + g)R*+D-i — g UiY, Uj 
L J y-o 


( 18 ) 


> ku D e a \ 


Notice that by (17), the second term on the left-hand side of (18) is nearly 
zero but non-negative. A good approximation for (18) is therefore 


d— i r 


g [g(D - i - 1) - (h + g ) £ fl.+y’L > ke aT . 

y**i 


(18') 


The optimal s and D are the joint solutions of the equation system (17) and
(18). The numbers $r_j$ and $u_i$ are known constants whose properties and values
for special distributions will be discussed in Section 6 below. At this stage
no definite suggestions can be made concerning efficient ways of solving
(17) and (18). Presumably, some iteration method is appropriate, starting
with the Wilson lot-size formula (14') and the newsboy equation (17').

C. Special Case of High Value, Low Demand 

When items are sufficiently costly, compared to the fixed ordering cost,
they are reordered each time a demand has occurred, implying that D = 1.

Conditions (17) and (18) specify the precise conditions under which this is 
the optimum policy. For this case, 


R.+i > ~~ and R, +1 > g + fe ° f (VW) 
g + h * +1 - g + h 


(19) 


c^ZToHowf® “ ““ dea °“ i °” ° f ’ 1,kk sma11 ' lhe 

Low demand may be taken to mean that stocking one unit is sufficient. A
necessary condition for this to be the optimal policy is that (19) be satisfied


6. Special Distributions 
A. Demand for One Unit at a Time 
Generally, if demand occurs one unit at a time, 
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m = ba 


T i*T 

r'i = f q(t)e~ at I p(i - l,r)e~ aT dr dt 
Jo J T—t 

+ r C p(* - 1. ^)« _or dr dt. 

J T Jo 


Integrating the first integral by parts yields 


br'i = e~ aT jJ p(i - 1, T - t)e tt, J“ gO)^ 1 dr dt. 

Define $Q(t) = \int_0^t q(\tau)\, d\tau$ and observe that, since demand is for one unit at a
time,

$$p(i, t) = Q^{(i)}(t) - Q^{(i+1)}(t),$$

where $Q^{(i)}$ is the $i$th convolution of $Q$. Disregarding discount factors,

$$b r'_i = \int_0^T \left[ Q^{(i-1)}(T - t) - Q^{(i)}(T - t) \right] \left[ 1 - Q(t) \right] dt.$$

Since

$$\int_0^T Q(t)\, Q^{(n)}(T - t)\, dt = \int_0^T Q^{(n+1)}(t)\, dt,$$

the following results:

$$b r'_i = \int_0^T \left[ Q^{(i+1)}(t) - 2 Q^{(i)}(t) + Q^{(i-1)}(t) \right] dt.$$


1. Gamma Distribution of Interval 
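Consider the gamma distribution

$$q(t) = \frac{(\lambda t)^{m-1}}{(m-1)!}\, \lambda e^{-\lambda t}$$

with integer $m$. Since

$$q^{(i)}(t) = \frac{(\lambda t)^{mi-1}}{(mi-1)!}\, \lambda e^{-\lambda t},$$

$$Q^{(i)}(t) = 1 - \sum_{j=0}^{mi-1} \frac{(\lambda t)^j}{j!}\, e^{-\lambda t} = 1 - \sum_{j=0}^{mi-1} p_j(\lambda t),$$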


and 
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say, then 


n m(i-fl) — 1 mi—l m(t-l)-n 

- £ + 2 £ - £ p^u) 

j=o j=o y=o J 

n mi-l m(i-fl) — i“i 

. £, - £ Pi(x<) <&. 

1) j—mi _] 

j p,(xt) <ft = £ [i - i p.(xr)J 


i mi-f-m—1 .j mi—X 

^'< = r £. Pj(\T)-\ £ p,-(xr) 

A j=mi A j=mi—m 


mi—1 y-fm 


= r.Z £ p„(XT). 

A j=mi — m n=j-j-l 


Since 6 = m/X, 


1 ”>*— X J+m 

Z P-(xr). 

riL j=m(i— i) n=y-fl 


In particular, for exponential interval distributions, m — 1 and 

r'i = p,(xr). 

B. Multiple Demand 
1. Fixed-Length Intervals 

Let 6 denote the length of the period between successive demands. 
Case a: 6 > T. Disregarding discounting, 

bri= X *7 p(* - j, t ) (ft (* > 0) 

„ (*y = *y(0)) 


f p(iy t 6) dt . 

Jo 


But since 6 > T, there is exactly one occasion for demand in the in¬ 
terval T < t < T + 6, namely at t = 0. Hence, 6r t - = Now 


h = f tq{t) dt = 0. 

Jo 


Thus 
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Ti = 


T 
e ' 


hr. 


= ir° T + / (f - T)q(t) dt= t 0 T + 6 


T 


= 0+ (to - 1) T 

To = 1 - (1 - to) J . 


Case b: 0 < T. 


i *T 

bn = / p(i - i, <) 

y=o J r—0 

,T-e 

== / p(i, 0 

j f 



where [x] denotes the largest integer not exceeding x. If 6 divides T say T = 
N8, then ’ 

hr, = eA N) 


In any case, 


* = r P(h t ) e-“ f A = b Z a\! n) = b J 

J 0 n=0 n=0 

when discounting is disregarded. 

When demand occurs at regularly spaced points in time, essentially a period 
model is obtained. The present model, although formulated in continuous 
terms, contains the period model as a special case. 

The following distributions for $\pi_i$ are of particular interest in this connection.
a. Geometric Distribution 


= (i = 0,1,2, • • •) 

+ i (a negative binominal distribution). 

u , - (_jl_y 

1 — aq \1 — aq / 

when discounts are disregarded. 


1 

V 
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b. Geometric Distribution With to = 0 

This distribution is obtained when a replacement part will fail with
probability $p$ upon installation. The probability that $i$ parts are needed for a
successful replacement is given by the distribution

$$\pi_i = q p^{i-1} \qquad (i = 1, 2, 3, \ldots)$$

The generating function is $g(x) = qx/(1 - px)$,

$$g^n(x) = q^n x^n (1 - px)^{-n}.$$

This is a negative binomial in terms of the variable $i - n$.

$$u_i = a q\, (p + a q)^{i-1}$$

$$= q \quad \text{if discounts are disregarded.}$$

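c. Poisson Distribution

$$\pi_i^{(n)} = \frac{(np)^i e^{-np}}{i!}$$

$$u_i = \frac{p^i}{i!} \sum_{n=0}^{\infty} n^i c^n, \quad \text{where } c = a e^{-p}.$$

The series may be evaluated by successive application of the operator $c\,(d/dc)$ to

$$\frac{1}{1 - c} = \sum_{n=0}^{\infty} c^n \qquad (|c| < 1).$$

It can be shown that

$$\sum_{n=0}^{\infty} n^i c^n = \frac{L_i(c)}{(1 - c)^{i+1}},$$

where $L_i(c)$ is a polynomial of degree $i$ whose coefficients are all positive
and add up to $i!$.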

2. Poisson Process with Independent Increments 

Let the intervals between demands be exponentially distributed, $q(t) = \lambda e^{-\lambda t}$,
and let the quantity demanded be independently and identically distributed
according to $\pi_j$. It is easily shown that both mean and variance of this process are
strictly proportional to time.
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-r»~f 

Jo Jo 


1 

a + X 


* r [T+t 

bti = X) Tj [ Xe~ xt / p(i — j, T — t) e~ aT dr dt 

j-0 L^O J T 

/ co pT+t "1 

\e~ u J p(i — j,T — r) e““ T dr dtj , 


which, upon integration by parts, becomes 


u = S - tt [ Xe x ‘ p(f — j,T — t) 
j =o a T A Jo 


a + X 


p(i,T), 


so that 

= p(«, 3 1 )- 

The generating function for this process is 

G(x, t ) = e~ l,+Ue(x) . 


Hence, 


2 = f @( x > f ) 

i«=0 */Q 


e~*‘ dt 


X — (a + X)g(x) 

= 1 1 
a + X 1 — a<7(:c) 

= b Z E aM n V, 


so that again as in the fixed interval case, 


E n (n) 

a iTi . 

n=0 


The expressions for $u_i$ obtained in the fixed interval case are therefore applicable.
New calculations are needed, however, for the compound distributions

$$r_i = p(i, T) = e^{-\lambda T} \sum_{n=0}^{\infty} \frac{(\lambda T)^n}{n!}\, \pi_i^{(n)}.$$
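
A compound distribution of this kind can be evaluated directly by truncating the series. The following is a minimal sketch, not the paper's procedure: `convolve` and `compound_poisson` are hypothetical names, the quantity distribution is passed as a finite list `pi`, and the series is cut off after `n_max` terms (an assumption that is harmless when `lam * T` is small relative to `n_max`).

```python
import math


def convolve(a, b, size):
    """First `size` terms of the discrete convolution of sequences a and b."""
    out = [0.0] * size
    for i, ai in enumerate(a[:size]):
        for j, bj in enumerate(b[:size - i]):
            out[i + j] += ai * bj
    return out


def compound_poisson(pi, lam, T, size, n_max=200):
    """Return p(i, T) for i = 0..size-1 when demands arrive at Poisson rate lam
    and each demand quantity is distributed according to pi."""
    pi = list(pi[:size]) + [0.0] * max(0, size - len(pi))
    conv = [1.0] + [0.0] * (size - 1)   # zero-fold convolution: total demand is 0
    weight = math.exp(-lam * T)          # Poisson probability of zero demands
    p = [weight * c for c in conv]
    for n in range(1, n_max + 1):
        conv = convolve(conv, pi, size)  # n-fold convolution of pi
        weight *= lam * T / n            # Poisson probability of n demands
        p = [pk + weight * ck for pk, ck in zip(p, conv)]
    return p
```

When every demand is for exactly one unit (`pi = [0.0, 1.0]`), the result reduces to the ordinary Poisson probabilities, which gives a simple consistency check.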
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a. Geometric Distribution: Stuttering Poisson Process 


$$\pi_i = q p^i$$

$$p(i, T) = p^i e^{-\lambda p T} \sum_{j=1}^{i} \binom{i-1}{j-1} \frac{(\lambda q T)^j}{j!}$$


b. Geometric Distribution With $\pi_0 = 0$


$$\pi_i = q p^{i-1} \qquad (i = 1, 2, \ldots)$$
p(i, T) = p i e~ XT 2 ^ Ty . 

j-o ! 


c. Poisson Distribution 


i oo i 


1 - n=0 711 

where c = X Te~ p . 


3. Gamma Distributed Interval 
Let 


. CxO” -1 Xe~ x ‘ , , . 

= —fTwi- and = *7 • 


Omitting discounts, 


6 = r< 2 (o=< = j 

Jo X 

x r T i r T+t 

r<== ml 5(0 5 **J T P^~3,t-r)drdt 

X r°° * rr+* 

+ mi r 9(<) g *>J t P«-i,T-t)dr t 


Since fe(<VJ (B) = g (B) (<)^ n) , 


p(*,0 - Z!. / g (B) (r)^ n) [1 - Q(< _ r )] 

71=0 JQ 
00 

= S Ki n) p(n,t) (see Section 3, A). 

n=fl 7 7 


Thus, whenever *■{(«) = Xf , 
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r> = r f qit) 52 f f qin, r) dr 52 v\-i *vl dt 
o Jo »=o L J r-( /-o J 

+ r f q(t) X [ f q( n > T ) dr 52 vil] Tyl dt 
0 Jt n-0 L>A) i-0 J 

= t f q(t) f 52 q( n > , r)iri n+1) dr dt 

0 Jo j T-t n=0 

+ r f <?(0 f 52 q(n, r)vi n+1) dr dt 

0 J T Jo 0 

For gamma distributions of demand intervals, 

m n+m —1 

p(n, t) = 52 Pii^T) 

jsasmn 

^ jP( m-f-l)n—1(^0 Pmn—1(^0 

qit) = Xe ~ Xl = Xp ”- l(Xf) 

r, = i f 52 [P {m +i)n-i(M) ~ Pmn- i(Xr)k< n+1) dr dt 

0 Jo J T—t n=0 

+ ^ f f X) \P (m+i)n-i(Xr) Pmn—i(Xr)]x» dr dt 

0 J T Jo n=Q 

The integrations can be carried out explicitly leading to expressions composed 
of Poisson terms. Unfortunately, they appear to be too complex to be of much 
interest. 

7. Conclusions 

Although this model carries the analysis of optimal inventory policies to
more general distributions than have been considered before, it is still inadequate
to cope with the situation where demand is generated by a fixed number of
customers, each using a non-trivial S,s policy. For then, the distribution of
demand is not independent of the timing (and quantities) of demands that
preceded the last demand. Even when the decision-maker chooses to disregard
the additional information contained in the timing of past demands, thus
restricting himself to an S,s policy, the analysis cannot disregard it in
constructing, say, what corresponds to the present function F(y) during a stock cycle.
The present dynamic programming approach, which is crucially dependent on
the irrelevance of the past beyond the last demand, cannot be extended to
these non-Markovian processes in any obvious way.

Appendix 


r qit, D)e~ at dt = 52 w D in) (" q M it)e~ a ‘ dt. 

Jo «—1 Jo 


(A.1) 


Note first that 
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f dt ~ f «” - r) dr d, 

’[I ^“'i”"lr )qU - r )e- : — ! dr dt 


The variable transformation 


, i — T = t', yields 


l q W (t)e-°‘ dt = £ q(/) e -°‘' dt' f 


e^q^ir) dr 


e—q^it) dt. 


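$$\int_0^\infty e^{-\alpha t} q^{(1)}(t)\, dt = \int_0^\infty e^{-\alpha t} q(t)\, dt = a.$$

Suppose that

$$\int_0^\infty e^{-\alpha t} q^{(n-1)}(t)\, dt = a^{n-1}.$$

Then (A.2) shows that $\int_0^\infty e^{-\alpha t} q^{(n)}(t)\, dt = a^n$ is true for all $n$ by induction.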


Hence, by (A.1), 


60 oo a 

S Wo{n)ar = £ «*<») j[ e U) (Oe- at df = f~ q(t, D) 


e~ at dt. 


Consider next. 


oo t 

V= §, l f ,r i( r )2( T )] Cn, [l - Q(t - t)] dr. 

The probability that, at time $t$, (past) demand equals $i$ units is the probability
that at some previous time $\tau$, $n$ trials were completed which resulted in a total
demand for $i$ units and that, since then, no trial has occurred. In order to evaluate


l «-p«, 0 dt - t jf «- 1‘ b(r)„(,)ni - Q(t -r)]dr dt, 


consider 


l e l [?( 7 ') ir *( r )] (n, [l - Q(t - r)] dr dt 

= I f b(OT f (r)] l " : 

Jo Jo 


3 *~”e - Q(t - T )] drdt. 
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With the variable transformation, t' = t — t, the above becomes 

= f [g(r)7r i (r)] <n, e-“ T dr f e^'ll - Q(t')\ dt' 

Jo Jo 

= b r [g(r)r f ( r)] M e~* T dr, 

Jo 

by definition of 6. It may now be shown by induction that 

f [g(0ri(*)] u, <r“‘ dt = a n pf. (A.3) 

Jo 

For n = 1, 

f q(t)n(t)e- at dt = f p(i, t)q(t)e" ai dt 
Jo Jo 

= api 

by definition of a and p { . Suppose (A.3) is true for n — 1. Substituting, 

[T,(09«)] <n) = E [‘ [g(r)^(r)] Cn - 1) 5« - - r) dr, 

;=0 Jo 

then 

f [T i (0fl(<)] , " > «"“ dt = ± r [q{r)* j {r)T- l) e- ar dt f ^j(t')q(t')e- at ' dt'. 
Jo i=0 «/q Jo 

= ta^r«PH 

J=0 

= a n pi n) . 

Thus, finally, 

r p(i, t)e~ at dt = b± a n pi n) . 

Jq n=»l 

This completes the proof of Equation (8).

[The above has been revised slightly in order to correct errors in the
original. Section 4B of the original paper has been omitted. -The Editor]

OPTIMAL POLICIES FOR A MULTI-ECHELON INVENTORY PROBLEM*

ANDREW J. CLARK 1 and HERBERT SCARF 2 

1. Introduction 3

In the last several years there have been a number of papers (Reference 1)
discussing optimal policies for the inventory problem. Almost without exception
these papers are devoted to the determination of optimal purchasing quantities at
a single installation faced with some pattern of demand. It has been customary
to make the assumption that when the installation in question requests a
shipment of stock, this shipment will be delivered in a fixed or perhaps random length
of time, but at any rate with a time lag which is independent of the size of the
order placed. There are, however, a number of situations met in practice in which
this assumption is not a tenable one. An important example arises when there
are several installations, say 1, 2, ..., N, with installation 1 receiving stock from
2, with 2 receiving stock from 3, etc. In this example, if an order is placed by
installation 1 for stock from installation 2, the length of time for delivery of this
stock is determined not only by the natural lead time between these two sites,
but also by the availability of stock at the second installation.

In this paper we shall consider the problem of determining optimal purchasing
quantities in a multi-installation model of this type. First of all, let us remark
that once the parameters of the model have been specified (lead times, purchase
costs, demand distributions, holding and shortage costs, etc.), the optimal
purchasing quantities may, in theory at least, be determined. The obvious way to
proceed would be to define a cost function for each configuration of stock at the
various installations, and in transit from one installation to another. We then
remark that this cost function satisfies the type of functional equation which
always appears in inventory theory, and from which the optimal provisioning
policies may be determined by a recursive computation. It is clear, however, that
this procedure is in general completely impractical since it necessitates the
recursive computation of a sequence of functions of at least N variables.

The question is, therefore, whether the obvious recursive computation of
optimal policies may be simplified for our multi-installation problem without
compromising the optimality of the solution. The answer is that such a simplification
may be obtained if several very plausible assumptions are incorporated in the
model. With these assumptions, it will be demonstrated in this paper that the
solution suggested by Clark in Reference 3 is indeed optimal. The solution will
be described in detail below. It should be remarked here, however, that the virtue
of the solution given by Clark is that it permits the optimal levels to be computed
separately by precisely those techniques which have been used in the past for
the computation of optimal policies at a single installation.

* Received October 1959.

1 Planning Research Corporation, Los Angeles, California.

2 Stanford University, Stanford, California.

3 This work was supported by the Bureau of Supplies and Accounts of the Department
of the Navy.

In Section IV we shall discuss various applications of the multiple-installation
technique to problems in which several installations have the same supplier. The
type of complex discussed in Section III may be described by the scheme:

[ ] -> [ ] -> [ ] ... [ ] -> [ ],
 N                     2      1

whereas the complex in Section IV has the scheme 

[Scheme not recoverable from the source: a branching arrangement in which
several installations share the same supplier.]

Unfortunately, the results for the latter type of complex are not as satisfactory 
as those for the former. 

2. The Multiple-Installation Model and a Description of the Solution 

Let us begin with a review of the model to be used for a single installation. An 
extensive discussion of this model is given in Reference 2, and we shall summarize 
here that material which will be of use to us. 

A sequence of purchasing decisions is made at the beginning of a number of 
regularly spaced intervals. The cost of purchasing an amount z will initially be a 
general function c(z), though we shall subsequently restrict ourselves to certain 
special cases. Delivery of an order occurs, say, $\lambda$ periods after the order is placed,
at which time the stock on hand is augmented by the amount of the order. During
each period the stock on hand is depleted by an amount equal to the demand
during the period, which is an observation from a distribution with density
function $\phi$, the demands being independent from period to period. (The demand
distributions may actually differ from period to period.)

In addition to the purchase cost, it is customary to charge several other costs 
during each period. The first of these costs is a holding cost, proportional to the 
stock on hand at the beginning of the period if it is positive; and the second, a 
shortage cost proportional to the deficit of available stock at the end of the 
period if there is such a deficit. If the stock on hand at the beginning of the period
is x, then the cost during the period, exclusive of purchasing costs, is given by

$$L(x) = \begin{cases} hx + p \int_x^\infty (t - x)\, \phi(t)\, dt, & x > 0 \\ p \int_0^\infty (t - x)\, \phi(t)\, dt, & x \le 0, \end{cases} \tag{1}$$

where h and p are the marginal holding and shortage costs, respectively. It is
useful for us to introduce occasionally more general holding and shortage
functions than the linear ones described in Equation (1), and for these functions
there will be an analogous form for the one-period cost L(x).

Any policy (sequence of purchasing decisions) produces a sequence of costs.
Costs which occur n periods in the future are discounted by an amount $\alpha^n$, so
that we may form a total discounted cost as the result of any policy. The optimal
purchasing policy is that one which minimizes the total discounted cost.

Let us consider an inventory problem in which there are n periods remaining,
with $x_1$ units of stock on hand, $w_1$ units to be delivered one period in the future,
and generally $w_j$ units to be delivered j periods in the future, where
$j = 1, 2, \ldots, \lambda - 1$. Let $C_n(x_1, w_1, \ldots, w_{\lambda-1})$ represent the expectation of the
discounted costs, beginning with such a configuration of stock and following an
optimal provisioning scheme. Using the type of reasoning employed in Reference
2, this sequence of functions may be shown to satisfy the following functional
equation:

$$C_n(x_1, w_1, \ldots, w_{\lambda-1}) = \min_{z \ge 0} \left\{ c(z) + L(x_1) + \alpha \int_0^\infty C_{n-1}(x_1 + w_1 - t,\ w_2, \ldots, w_{\lambda-1},\ z)\, \phi(t)\, dt \right\}, \tag{2}$$


where the minimizing value of z is the optimal purchase quantity for the given
stock configuration. In the writing of this equation we are explicitly assuming
that all excess demand is backlogged until the necessary stock becomes
available. This equation has been analyzed in considerable detail and we shall quote
for future use those facts of relevance to us.

1. The optimal policy (i.e., the minimizing value of z) is a function of the total
stock on hand plus on order, regardless of the dates of delivery. This result
depends crucially on the assumption that excess demand is backlogged.
Moreover, it may be shown that

$$C_n(x_1, w_1, \ldots, w_{\lambda-1}) = L(x_1) + \alpha \int_0^\infty L(x_1 + w_1 - t)\, \phi(t)\, dt + \cdots + \alpha^{\lambda-1} \int_0^\infty \cdots \int_0^\infty L(x_1 + w_1 + \cdots + w_{\lambda-1} - t_1 - \cdots - t_{\lambda-1})\, \phi(t_1) \cdots \phi(t_{\lambda-1})\, dt_1 \cdots dt_{\lambda-1} + f_n(x_1 + w_1 + \cdots + w_{\lambda-1}), \tag{3}$$

and that $f_n$ satisfies the functional equation

$$f_n(u) = \min_{y \ge u} \left\{ c(y - u) + \alpha^\lambda \int_0^\infty \cdots \int_0^\infty L(y - t_1 - \cdots - t_\lambda)\, \phi(t_1) \cdots \phi(t_\lambda)\, dt_1 \cdots dt_\lambda + \alpha \int_0^\infty f_{n-1}(y - t)\, \phi(t)\, dt \right\}. \tag{4}$$

If $y^*$ is the minimizing value in Equation (4), then $y^* - u$ is the optimal
purchase quantity, where $x_1 + w_1 + \cdots + w_{\lambda-1} = u$. (Obvious modifications in
Equations (3) and (4) are required when n is less than the time lag.) These
results (Reference 2) permit us to reduce the inventory problem with a time lag
to one in which essentially no lag exists.

2. The results mentioned above are valid for any ordering function c(z),
whenever excess demand is backlogged. Now let us restrict our attention to the
cost function

$$c(z) = \begin{cases} K + c \cdot z, & z > 0 \\ 0, & z = 0. \end{cases} \tag{5}$$

(K is the setup cost and c the unit cost.) Let us also assume that the one-period
costs

$$\alpha^\lambda \int_0^\infty \cdots \int_0^\infty L(y - t_1 - \cdots - t_\lambda)\, \phi(t_1) \cdots \phi(t_\lambda)\, dt_1 \cdots dt_\lambda$$

are convex. (This is certainly correct if the holding and shortage costs are linear,
and in other cases also.) Then there exists a sequence of critical numbers $(S_n, s_n)$
so that in period n it is optimal to order only if $x_1 + \cdots + w_{\lambda-1} < s_n$, and if we
do order we order an amount $S_n - (x_1 + \cdots + w_{\lambda-1})$ (Reference 4). The specific
form of the one-period costs is irrelevant; we require only that they be convex.

3. An additional simplification occurs if K = 0. The upper and lower critical
numbers become the same and it is customary to denote their common value
by $\bar{x}_n$. The optimal purchase quantity is given by

$$\max\left(0,\ \bar{x}_n - (x_1 + \cdots + w_{\lambda-1})\right).$$

In this case somewhat more is known about the properties of the functions
$f_n(u)$ (Reference 2): $f_n(u)$ is always convex, and in addition, $f'_n(u) = -c$ for $u \le \bar{x}_n$.
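
The two ordering rules just described can be written out in a few lines. This is a minimal sketch with hypothetical function names; `position` stands for the stock on hand plus on order, $x_1 + \cdots + w_{\lambda-1}$.

```python
def order_quantity(position, S, s):
    """(S, s) rule: order up to S when the stock position is below s, else nothing."""
    return S - position if position < s else 0


def base_stock_order(position, x_bar):
    """K = 0 case: both critical numbers coincide at the single level x_bar."""
    return max(0, x_bar - position)
```

Setting `S == s` in `order_quantity` reproduces `base_stock_order`, which mirrors the statement that the upper and lower critical numbers become the same when there is no setup cost.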

Let us now turn our attention to the description of the multiple-installation
model. We shall make the following assumptions:

Assumption 1: Demand originates in the system at the lowest installation 
(installation 1), and at no other point in the system. 

Assumption 2: The cost of purchasing and shipping an item from any
installation to the next will be linear, without any setup cost. The only exception to this
assumption will be at the highest installation, at which point a setup cost will
be permitted.

Assumption 3: At the lowest installation (installation 1), a linear holding and
shortage cost will be operative, in the same manner as the single-installation
problem described above. We make the assumption that holding and shortage
costs at the second installation do not depend only on the stock on hand at the
second installation, but are functions of this stock, plus stock in transit to the
first installation, plus stock on hand at the first installation. Generally speaking,
holding and shortage costs at any level will be assumed to be functions of the
stock at that level plus all other stock in the system which is actually at a lower
level or in transit to a lower level. We shall call these costs the natural one-period
costs at the level. They may, of course, be equal to zero.

Clark in Reference 3 has given the name “echelon” to the system consisting
of the stock at any given installation plus stock in transit to or on hand at a lower
installation. The echelons will be numbered according to the highest installation
in the echelon. Our assumption may be stated as requiring that the one-period
costs be functions of the echelon stock rather than installation stock. The
simplifications described in this paper are very crucially tied to this assumption and
Assumption 2.
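
The definition of echelon stock can be made concrete for a series of installations. The following is a minimal sketch under an assumed data layout: `on_hand[j]` is the stock at installation j + 1 and `in_transit[j]` is the stock en route to installation j + 1 (both lists and the function name are hypothetical).

```python
def echelon_stocks(on_hand, in_transit):
    """Return the echelon stock of every installation, lowest installation first.

    Echelon stock = stock at the installation plus all stock on hand at,
    or in transit to, lower installations.
    """
    echelons = []
    lower = 0  # stock on hand at, or in transit to, all lower installations
    for stock, arriving in zip(on_hand, in_transit):
        echelons.append(stock + lower)
        lower += stock + arriving
    return echelons
```

For two installations with 5 and 8 units on hand and 3 units in transit to installation 1, echelon 2 holds 8 + 5 + 3 = 16 units; stock still on its way to installation 2 is not counted in echelon 2.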

Assumption 4: Each echelon backlogs excess demand. 

With these specifications in mind let us turn our attention to the
determination of the optimal provisioning levels. The solution suggested by Clark is best
described by means of an example. We consider the case of two installations.
The natural lead time from installation 2 to installation 1 will be two periods in
this example. Let us denote the stock on hand at installation 1 by $x_1$; the stock
to be delivered one period in the future by $w_1$; and the stock on hand at
installation 1, plus on hand at installation 2, plus in transit from 2 to 1, by $x_2$ (i.e., $x_2$
is echelon 2 stock). The one-period costs at installation 1 will be denoted by
$L(x_1)$, and those at echelon 2 by $L(x_2)$. The unit shipping cost from 2 to 1 will
be denoted by $c_1$.

We begin by solving the problem (that is, determining the single critical
numbers $\bar{x}_1 = 0$, $\bar{x}_2 = 0$, $\bar{x}_3$, $\bar{x}_4$, ...) for installation 1 without any reference to
the remaining parts of the multiple-echelon system. In other words, we solve the
single-installation problem for the lowest echelon, assuming that delivery of any
order, regardless of its size, will be effected in two periods, and using in our
calculations a unit cost equal to the transportation cost from the higher echelon,
without any reference to the original purchase cost. This would suggest that if at the
beginning of the nth period the stock on hand plus on order at installation 1 is
less than $\bar{x}_n$, we order the difference; and if the stock is larger than $\bar{x}_n$, we do
not order. The problem is, of course, that there may not be adequate stock at
installation 2 to fill such an order. In the solution given in this paper, it is shown
that we ship only that part of the order for which there is available stock at the
next highest echelon. This describes the optimal policy at the lowest installation
(Theorem 1, below).
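
The shipping rule for the lowest installation in the two-installation example can be sketched as follows. The function name is hypothetical, and the sketch assumes that `x2` is the echelon 2 stock, so that `x2 - (x1 + w1)` is the stock physically on hand at installation 2.

```python
def shipment_to_installation_1(x1, w1, x2, x_bar_n):
    """Quantity shipped from installation 2 to installation 1 this period."""
    position = x1 + w1                    # on hand plus in transit at installation 1
    desired = max(0, x_bar_n - position)  # what installation 1 would like to order
    available = x2 - position             # stock on hand at installation 2
    return min(desired, available)
```

When installation 2 holds enough stock the full difference is shipped; otherwise installation 1 receives only what is available.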

The next question is that of the optimal quantity of stock to bring in at echelon
2. It will be shown in the next section that the optimal purchase quantities at
echelon 2 are functions only of $x_2$, the stock at the two installations plus the
stock in transit. Moreover, the optimal policies for this echelon may be
computed by the standard single-installation model using the ordering cost
appropriate to this echelon, and the natural one-period costs described above ($L(x_2)$).
The important idea is that we must in some fashion introduce a penalty at this
echelon for keeping a quantity of stock on hand which is insufficient to meet
the normal requests from the lower installation (Theorem 2, below). The
procedure for doing this is quite simple: We merely introduce an additional
one-period cost at the second echelon which is precisely equal to the expected
increment in total cost at installation 1, because the stock at echelon 2 is inadequate
to bring the lower level's stock up to the required point $\bar{x}_n$.

In the example that we are discussing, the specific form for this additional
one-period cost may be found as follows: We recall the definition of the functions
$C_n(x_1, w_1)$ to be the minimum expected discounted cost at echelon 1 if there are
n periods remaining and if the stock on hand is $x_1$; and the stock on order, $w_1$.
(This function is to be computed on the basis of an ordering cost equal to $c_1$,
the transportation cost.) For n = 1, $C_1(x_1, w_1) = L(x_1)$, and also
$C_2(x_1, w_1) = L(x_1) + \alpha \int_0^\infty L(x_1 + w_1 - t)\, \phi(t)\, dt$. In this expression for $C_2$, the first term
represents the expected one-period costs in the immediate period, and the second
term represents similar costs for the next period. Inasmuch as delivery of any
order takes two periods, there is no modification that can be made in these costs.
For n > 2, we use the decomposition described in Equation (3); that is,

$$C_n(x_1, w_1) = L(x_1) + \alpha \int_0^\infty L(x_1 + w_1 - t)\, \phi(t)\, dt + f_n(x_1 + w_1). \tag{6}$$

The first two terms on the right-hand side are as described above; the third term
represents the optimal cost exclusive of those costs which it is impossible to
modify by a request for a shipment.

As in Equation (4), the functions $f_n(u)$ satisfy

$$f_n(u) = \min_{y \ge u} \left\{ c_1 \cdot (y - u) + \alpha^2 \int_0^\infty\!\!\int_0^\infty L(y - t_1 - t_2)\, \phi(t_1)\phi(t_2)\, dt_1\, dt_2 + \alpha \int_0^\infty f_{n-1}(y - t)\, \phi(t)\, dt \right\}, \tag{7}$$

and the minimizing value is $\bar{x}_n$. In other words, if $x_1 + w_1 < \bar{x}_n$ so that ordering
occurs, the minimum cost will be

$$c_1 \cdot (\bar{x}_n - u) + \alpha^2 \int_0^\infty\!\!\int_0^\infty L(\bar{x}_n - t_1 - t_2)\, \phi(t_1)\phi(t_2)\, dt_1\, dt_2 + \alpha \int_0^\infty f_{n-1}(\bar{x}_n - t)\, \phi(t)\, dt. \tag{8}$$

If, however, $x_2$, the stock at both installations, plus stock in transit, is less than
$\bar{x}_n$, we will be able to ship only $x_2 - (x_1 + w_1)$ and therefore the minimum cost
will be

$$c_1 \cdot (x_2 - u) + \alpha^2 \int_0^\infty\!\!\int_0^\infty L(x_2 - t_1 - t_2)\, \phi(t_1)\phi(t_2)\, dt_1\, dt_2 + \alpha \int_0^\infty f_{n-1}(x_2 - t)\, \phi(t)\, dt. \tag{9}$$

This is of course larger than Expression (8), and the difference in cost
is attributable exclusively to the insufficiency of stock at level 2. Therefore
the additional one-period loss to be charged to this echelon is given by Expression
(9) minus Expression (8), or

$$c_1 \cdot (x_2 - \bar{x}_n) + \alpha^2 \int_0^\infty\!\!\int_0^\infty \left[ L(x_2 - t_1 - t_2) - L(\bar{x}_n - t_1 - t_2) \right] \phi(t_1)\phi(t_2)\, dt_1\, dt_2 + \alpha \int_0^\infty \left[ f_{n-1}(x_2 - t) - f_{n-1}(\bar{x}_n - t) \right] \phi(t)\, dt, \tag{10}$$

if $x_2 < \bar{x}_n$ and zero if $x_2 \ge \bar{x}_n$. With this additional one-period loss to be charged
to the second echelon, the optimal policy is then computed using standard
techniques. Of course the specific values of Expression (10) involve the critical
numbers $\bar{x}_n$ and the functions $f_n(u)$, but these will have been computed already
for installation 1.

It is worth remarking that Expression (10) is a convex function of $x_2$, so that
the optimal policy for the second echelon will be of the (S, s) type.

If there are more than two echelons, the same procedure is repeated, always 
augmenting the natural one-period loss at the echelon by the increment in total 
cost at the lower echelon due to the lack of available stock. 
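
The additional one-period loss charged to the upper echelon is a difference of one cost function evaluated at two points, which can be sketched compactly. In the sketch below `cost_at_level` is a hypothetical callable standing for the bracketed expression minimized at installation 1, viewed as a function of the level y that the stock is raised to; the term in u cancels in the difference, so it is omitted.

```python
def induced_penalty(cost_at_level, x2, x_bar_n):
    """Extra expected cost at installation 1 when echelon 2 stock x2 cannot
    raise the lower level to its critical number x_bar_n; zero otherwise."""
    if x2 >= x_bar_n:
        return 0.0
    return cost_at_level(x2) - cost_at_level(x_bar_n)
```

Because `x_bar_n` minimizes `cost_at_level`, the penalty is never negative; with the toy convex cost `lambda y: (y - 10) ** 2` and `x_bar_n = 10`, an echelon stock of 7 gives a penalty of 9.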

3. The Proof of Optimality 

In this section we shall prove that the procedure suggested in the previous 
section is indeed optimal. Because of notational difficulties, we shall restrict our 
attention to the example described in the previous section although the ideas 
are quite general. In order to be specific we shall assume the time lag in delivery 
to installation 2 to be a single period. 

Our approach will be to investigate the optimal solution for the entire system,
and show that it reduces to the solution given by Clark. The first step is to write
down a sequence of functional equations, analogous to Equation (3), but for the
entire system rather than a single installation. We define $C_n(x_1, w_1, x_2)$ to be the
minimum expected value of the discounted system costs if there are n periods
remaining; if stock on hand at installation 1 is $x_1$; stock in transit, $w_1$; and system
stock, $x_2$. At the beginning of the period two decisions are made: the first, a
decision as to how much system stock to order for delivery next period; and the
second, a decision as to the quantity of stock to be placed in transit to
installation 1. The stock on hand plus in transit to installation 1 may be raised from
$x_1 + w_1$ to $y$, where $y$ is any number between $x_1 + w_1$ and $x_2$, at a cost of
$c_1 \cdot (y - x_1 - w_1)$; and if such a decision is taken, at the beginning of the next
period stock on hand at installation 1 will be $x_1 + w_1 - t$ ($t$ is the demand), and the
stock in transit will be $y - x_1 - w_1$. The system stock is, of course, not modified by this
decision; it can only be changed by a decision to introduce $z$ units into the system
(at a cost of $c(z)$), and will become $x_2 + z - t$. Therefore, if the two decisions
described by $y$ and $z$ are taken, the inventories $(x_1, w_1, x_2)$ become
$(x_1 + w_1 - t,\ y - x_1 - w_1,\ x_2 + z - t)$, and the discounted value of expected future costs
will be

$$\alpha \int_0^\infty C_{n-1}(x_1 + w_1 - t,\ y - x_1 - w_1,\ x_2 + z - t)\, \phi(t)\, dt. \tag{11}$$

In order to complete the accounting, we should consider the purchase (and
transportation), holding and shortage costs. The purchase and transportation costs
are given by

$$c(z) + c_1 \cdot (y - x_1 - w_1). \tag{12}$$

The shortage and holding costs are given by

$$L(x_2) + L(x_1), \tag{13}$$

the terms of which apply, respectively, to echelon 2 and installation 1.
$C_n(x_1, w_1, x_2)$ is, of course, equal to the minimum of Expressions (11) + (12) +
(13), when y and z are chosen optimally, and we therefore obtain the following
functional equation:

$$C_n(x_1, w_1, x_2) = \min_{\substack{x_1 + w_1 \le y \le x_2 \\ 0 \le z}} \left\{ c(z) + c_1 \cdot (y - x_1 - w_1) + L(x_2) + L(x_1) + \alpha \int_0^\infty C_{n-1}(x_1 + w_1 - t,\ y - x_1 - w_1,\ x_2 + z - t)\, \phi(t)\, dt \right\}, \tag{14}$$

with the condition $C_0 = 0$.
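
The state transition underlying this functional equation can be written as a small function. The sketch is illustrative only (the function name and the error handling are assumptions); it applies the two decisions y and z and a realized demand t to the inventory triple.

```python
def next_state(x1, w1, x2, y, z, t):
    """Return the inventories (x1, w1, x2) at the start of the next period.

    y: level to which on-hand plus in-transit stock at installation 1 is raised
    z: units introduced into the system; t: demand realized during the period
    """
    if not (x1 + w1 <= y <= x2):
        raise ValueError("y must lie between x1 + w1 and x2")
    if z < 0:
        raise ValueError("z must be non-negative")
    return (x1 + w1 - t, y - x1 - w1, x2 + z - t)
```

For example, starting from (4, 2, 15) with y = 9, z = 5 and demand t = 3, the next state is (3, 3, 17).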

Let us also introduce the functional equation which would be used to compute
optimal policies for installation 1 in isolation. Let $C_n(x_1, w_1)$ be the minimum
expected value of the discounted costs for an n period problem at installation 1,
which begins with $x_1$ units on hand and $w_1$ units in transit. We are assuming that
the unit purchase price is the transportation cost and that all orders are
delivered in two periods. $C_n$ satisfies

$$C_n(x_1, w_1) = \min_{y \ge x_1 + w_1} \left\{ c_1 \cdot (y - x_1 - w_1) + L(x_1) + \alpha \int_0^\infty C_{n-1}(x_1 + w_1 - t,\ y - x_1 - w_1)\, \phi(t)\, dt \right\}. \tag{15}$$

Of course, the solution of Equation (15) is of no clear relevance to Equation (14)
as yet.

Now $C_1(x_1, w_1) = L(x_1)$ and $C_1(x_1, w_1, x_2) = L(x_1) + L(x_2)$. In other
words $C_1(x_1, w_1, x_2) = C_1(x_1, w_1) + g_1(x_2)$. We shall show that $C_n(x_1, w_1, x_2)$
may always be written as $C_n(x_1, w_1)$ + a function of $x_2$ alone, and this is the
important step in verifying that Clark's solution is optimal.

Theorem 1. There is a sequence of functions $g_n(x_2)$, with $g_1(x_2) = L(x_2)$,
such that

$$C_n(x_1, w_1, x_2) = C_n(x_1, w_1) + g_n(x_2). \tag{16}$$

Moreover it is optimal for installation 1 to provision without reference to
installation 2, subject to the proviso that if insufficient stock is available at
installation 2, then installation 1 will be content with getting as much as it
can.

We shall prove this theorem by induction. Let us suppose that Equation (16) is
valid for $(n - 1)$, and we shall then demonstrate its validity for $n$.
Substituting in Equation (14), we obtain

$$C_n(x_1, w_1, x_2) = \min_{\substack{x_1 + w_1 \le y \le x_2 \\ 0 \le z}} \Big\{ c(z) + c_1(y - x_1 - w_1) + L(x_2) + L(x_1) + \alpha \int_0^\infty C_{n-1}(x_1 + w_1 - t,\; y - x_1 - w_1)\,\phi(t)\,dt + \alpha \int_0^\infty g_{n-1}(x_2 + z - t)\,\phi(t)\,dt \Big\}. \tag{17}$$

From Equation (17) we see that aside from the constraint that $y$ be less than
$x_2$, the optimal selection of $y$ is such as to minimize

$$c_1(y - x_1 - w_1) + L(x_1) + \alpha \int_0^\infty C_{n-1}(x_1 + w_1 - t,\; y - x_1 - w_1)\,\phi(t)\,dt,$$

and this is, of course, the same as the single critical number $\bar{x}_n$ for
the problem of installation 1 considered separately. If it turns out that
$x_2 \ge \bar{x}_n$, then the constraint $x_2 \ge y$ is not operative, and we may
therefore conclude that for $x_2 \ge \bar{x}_n$,

$$C_n(x_1, w_1, x_2) = C_n(x_1, w_1) + \min_{z \ge 0} \Big\{ c(z) + L(x_2) + \alpha \int_0^\infty g_{n-1}(x_2 + z - t)\,\phi(t)\,dt \Big\}. \tag{18}$$

On the other hand, if $x_2 < \bar{x}_n$ (and therefore $x_1 + w_1 < \bar{x}_n$),
installation 1 will be thwarted in its attempt to bring its stock level up to
$\bar{x}_n$. Because of the convexity of the one-period costs, it is optimal to
bring the stock level up as high as possible and therefore $y = x_2$. As a
consequence, we see that for $x_2 < \bar{x}_n$,

$$C_n(x_1, w_1, x_2) = c_1(x_2 - x_1 - w_1) + L(x_1) + \alpha \int_0^\infty C_{n-1}(x_1 + w_1 - t,\; x_2 - x_1 - w_1)\,\phi(t)\,dt + \min_{z \ge 0} \Big\{ c(z) + L(x_2) + \alpha \int_0^\infty g_{n-1}(x_2 + z - t)\,\phi(t)\,dt \Big\}. \tag{19}$$

Now we are interested in showing that $C_n(x_1, w_1, x_2) - C_n(x_1, w_1)$ is a
function of $x_2$ alone. From Equations (18) and (19), we see that this
difference is equal to

$$\Lambda_n(x_1, w_1, x_2) + \min_{z \ge 0} \Big\{ c(z) + L(x_2) + \alpha \int_0^\infty g_{n-1}(x_2 + z - t)\,\phi(t)\,dt \Big\}, \tag{20}$$

where

$$\Lambda_n(x_1, w_1, x_2) = c_1(x_2 - x_1 - w_1) + L(x_1) + \alpha \int_0^\infty C_{n-1}(x_1 + w_1 - t,\; x_2 - x_1 - w_1)\,\phi(t)\,dt - C_n(x_1, w_1), \tag{21}$$

when $x_2 < \bar{x}_n$ and zero otherwise.

In order to demonstrate Theorem 1, it is therefore necessary to show that
$\Lambda_n(x_1, w_1, x_2)$ is in reality a function of $x_2$ alone, and of course
we need only consider the region $x_2 < \bar{x}_n$. In this region, however,

$$C_n(x_1, w_1) = c_1(\bar{x}_n - x_1 - w_1) + L(x_1) + \alpha \int_0^\infty C_{n-1}(x_1 + w_1 - t,\; \bar{x}_n - x_1 - w_1)\,\phi(t)\,dt, \tag{22}$$

and therefore Equation (21) may be written as

$$\Lambda_n(x_1, w_1, x_2) = c_1(x_2 - \bar{x}_n) + \alpha \int_0^\infty \big[ C_{n-1}(x_1 + w_1 - t,\; x_2 - x_1 - w_1) - C_{n-1}(x_1 + w_1 - t,\; \bar{x}_n - x_1 - w_1) \big]\,\phi(t)\,dt. \tag{23}$$

Our theorem will be demonstrated if we can show that the integrand in Equation
(23) is independent of $x_1$ and $w_1$. But by Equation (6),

$$C_{n-1}(x_1, w_1) = L(x_1) + \alpha \int_0^\infty L(x_1 + w_1 - y)\,\phi(y)\,dy + f_{n-1}(x_1 + w_1),$$

and therefore the integrand in Equation (23) is given by

$$\alpha \int_0^\infty L(x_2 - t - y)\,\phi(y)\,dy + f_{n-1}(x_2 - t) - \alpha \int_0^\infty L(\bar{x}_n - t - y)\,\phi(y)\,dy - f_{n-1}(\bar{x}_n - t), \tag{24}$$

which is a function of $x_2$ alone. We have therefore demonstrated Theorem 1.

We have, however, demonstrated somewhat more. $\Lambda_n$ is now known to be a
function of $x_2$, which may be written as

$$\Lambda_n(x_2) = c_1(x_2 - \bar{x}_n) + \alpha^2 \int_0^\infty\!\!\int_0^\infty \big[ L(x_2 - t - y) - L(\bar{x}_n - t - y) \big]\,\phi(t)\,\phi(y)\,dt\,dy + \alpha \int_0^\infty \big[ f_{n-1}(x_2 - t) - f_{n-1}(\bar{x}_n - t) \big]\,\phi(t)\,dt, \tag{25}$$

for $x_2 < \bar{x}_n$ and zero for $x_2 \ge \bar{x}_n$. But Equation (20), which
represents $g_n(x_2)$, may be written as

$$g_n(x_2) = \min_{z \ge 0} \Big\{ c(z) + L(x_2) + \Lambda_n(x_2) + \alpha \int_0^\infty g_{n-1}(x_2 + z - t)\,\phi(t)\,dt \Big\}. \tag{26}$$

The solution of this equation provides us with the optimal policy for the entire
system. As we see, all that is required is to augment the natural costs at
echelon 2 by $\Lambda_n(x_2)$.

Theorem 2. The functions $g_n(x_2)$ satisfy Equation (26), by means of which the
optimal system stock may be obtained.

4. Several Installations with the Same Supplier

In this section we shall generalize the model considered in Section III so as to
include the possibility of several installations with the same supplier. We shall,
however, retain the restriction that no installation has two different suppliers.
All of the assumptions that have previously been mentioned, such as backlogging,
no setup cost for transportation, etc., will be retained also in this model.
The only point in need of clarification is our assumption that the natural losses
should be functions of echelon stock, rather than installation stock. The notion
of an echelon in this model will be as follows: We begin by selecting a specific
installation, say installation $I$. Associated with $I$ will be a number of other
installations which receive, directly or indirectly, stock from installation $I$.
The total stock at $I$, plus the stock in transit or on hand at these other
installations, will comprise the echelon associated with installation $I$. With
this definition our assumption will again be that the natural one-period costs
are associated with echelons, rather than installations.

Let us consider the following example of such a complex:

[Diagram: an outside source supplies $C_1$; $C_1$ supplies $B_1$ and $B_2$;
$B_1$ supplies $A_1$; $B_2$ supplies $A_2$ and $A_3$.]

The procedure which we have shown in the previous section to be optimal for
a simpler problem suggests the following procedure in this complex:

(1) For installations $A_1$, $A_2$, and $A_3$ (which are terminal installations)
    compute the optimal sequences of single critical numbers, assuming the
    installations to be in isolation; also that all requests for shipment are
    supplied during the natural lead time, and that the purchase cost is given by
    the transportation cost from the higher echelon.

(2) Augment the natural costs at echelon $B_1$ by the increment in cost at $A_1$
    because of the inability to satisfy requests for stock at $A_1$, and do the
    same for $B_2$. Then compute the optimal stock levels at $B_1$ and $B_2$
    separately, assuming the availability of infinite stock from $C_1$.

(3) Modify the natural costs at $C_1$ by the increment in cost at $B_1$ and $B_2$
    because of the inability to satisfy requests for stock, and then compute
    the optimal policy at $C_1$.

If the directions given above are examined closely, it may be seen that they 
are ambiguous on a number of points. The clarification of these points will show 
where the Clark procedure departs from optimality in this model, whereas it was 
optimal for the model considered in Section III. Even though the procedure is 
not optimal, it has considerable merit, both in its ease of application and in its 
approximate validity. 

Point 1. Shall we permit an arbitrary pair of installations to exchange stock; 
and if so, at what cost, and with what lags? 

As an example, we are posing the question as to whether $A_2$ shall be permitted
to ship excess stock to $A_3$. The desire to make this shipment might arise in two
different ways. First of all, there may be insufficient stock at echelon $B_2$ to
raise both $A_2$ and $A_3$ to their required critical levels, and the stocks left
over at $A_2$ and $A_3$ may be out of balance by a sufficient amount so that it is
wise to ship both $B_2$ and $A_2$ stock to $A_3$. Another possible cause of
transshipment might be a substantial anticipated drop in demand at $A_2$ and an
excess of carryover stock which might profitably be shipped to $A_3$.

In practice, however, transshipment of this sort would rarely take place.
Moreover, if we permit this sort of transshipment to take place, the theoretical
and computational aspects of the problem become quite complex. It would be
meaningless for an installation to consider itself in isolation, inasmuch as its
actual stock levels in the future would depend on the disposition of stock at all
other installations. Since our primary aim is to be able to compute optimal
policies at each installation separately, we shall assume that such transshipment
is impossible. It is gratifying that such an assumption does not run contrary to
what is done in practice.

Point 2. If all requests cannot be satisfied because of insufficient stock at a
higher echelon, how is the available stock to be rationed among the requesting
installations?

The answer to this question bears very heavily on the optimality of the
procedure suggested above. We shall consider the following concrete case:

[Diagram: a single supplier $B$ serving two terminal installations $A_1$ and
$A_2$.]

and assume, for definiteness, that all routes have a time lag of one period, and
that the transportation cost $c_1$ is the same from $B$ to $A_1$ as from $B$ to
$A_2$. Let $L^1(x_1)$ and $L^2(x_2)$ represent the one-period costs at
installations 1 and 2, respectively, and $L(x_3)$ the one-period cost at echelon
$B$. Let $C_n^1(x_1)$ and $C_n^2(x_2)$ represent the minimum costs if $A_1$ and
$A_2$ are computed separately, and let $\{\bar{x}_n^1\}$ and $\{\bar{x}_n^2\}$ be
the sequences of critical numbers, so that

$$C_n^1(x_1) = \min_{y_1 \ge x_1} \Big\{ c_1(y_1 - x_1) + L^1(x_1) + \alpha \int_0^\infty C_{n-1}^1(y_1 - t)\,\phi_1(t)\,dt \Big\}, \tag{27}$$

and similarly for $C_n^2$.

For this problem, we define $C_n(x_1, x_2, x_3)$ to be the minimum system cost
if $A_1$ has $x_1$ units, $A_2$ has $x_2$ units, and the $B$ echelon has $x_3$
units. These functions satisfy a functional equation analogous to Equation (17),
i.e.,

$$C_n(x_1, x_2, x_3) = \min \Big\{ c(z) + c_1(y_1 - x_1) + c_1(y_2 - x_2) + L(x_3) + L^1(x_1) + L^2(x_2) + \alpha \iint C_{n-1}(y_1 - t_1,\; y_2 - t_2,\; x_3 + z - t_1 - t_2)\,\phi_1(t_1)\,\phi_2(t_2)\,dt_1\,dt_2 \Big\}, \tag{28}$$

where the minimization is over the region $y_1 \ge x_1$, $y_2 \ge x_2$,
$x_3 \ge y_1 + y_2$, $z \ge 0$.

The crucial ideas behind the optimality results of the previous section were
embodied in Theorems 1 and 2. The analogue of Theorem 1 for the model considered
in this section would be that there exists a sequence of functions $g_n(x_3)$
with the property that

$$C_n(x_1, x_2, x_3) = C_n^1(x_1) + C_n^2(x_2) + g_n(x_3). \tag{29}$$

Does there exist such a sequence of functions? And if there does, what light
is cast upon the question of allocation of stock (Point 2)?

Unfortunately the answer to the first of these questions is in the negative.
The functions $C_n(x_1, x_2, x_3)$ cannot be broken down in the form of Equation
(29). To see why this is so, let us assume that Equation (29) is valid for
$n - 1$, and see what the consequences of Equation (28) and this assumption would
be for $C_n(x_1, x_2, x_3)$. We would have

$$C_n(x_1, x_2, x_3) = \min \Big\{ c(z) + c_1(y_1 + y_2 - x_1 - x_2) + L(x_3) + L^1(x_1) + L^2(x_2) + \alpha \int_0^\infty C_{n-1}^1(y_1 - t_1)\,\phi_1(t_1)\,dt_1 + \alpha \int_0^\infty C_{n-1}^2(y_2 - t_2)\,\phi_2(t_2)\,dt_2 + \alpha \iint g_{n-1}(x_3 + z - t_1 - t_2)\,\phi_1(t_1)\,\phi_2(t_2)\,dt_1\,dt_2 \Big\}. \tag{30}$$

Aside from the constraint that $y_1 + y_2$ be less than $x_3$, the optimal
selection of $y_1$ would be $\bar{x}_n^1$ and the optimal selection of $y_2$
would be $\bar{x}_n^2$. If $x_3 > \bar{x}_n^1 + \bar{x}_n^2$, the constraint is
not operative, and from Equation (30) we would have

$$C_n(x_1, x_2, x_3) = C_n^1(x_1) + C_n^2(x_2) + \min_{z \ge 0} \Big\{ c(z) + L(x_3) + \alpha \iint g_{n-1}(x_3 + z - t_1 - t_2)\,\phi_1(t_1)\,\phi_2(t_2)\,dt_1\,dt_2 \Big\}. \tag{31}$$

So far, so good. We run into a problem, however, when
$x_3 < \bar{x}_n^1 + \bar{x}_n^2$. This is, of course, the problem raised by
Point 2, and the answer is given by Equation (30). The numbers $y_1$ and $y_2$
should be selected according to the constraints

$$y_1 + y_2 = x_3, \qquad y_1 \ge x_1, \qquad y_2 \ge x_2 \tag{32}$$

and such as to minimize

$$c_1(y_1 + y_2 - x_1 - x_2) + L^1(x_1) + L^2(x_2) + \alpha \int_0^\infty C_{n-1}^1(y_1 - t_1)\,\phi_1(t_1)\,dt_1 + \alpha \int_0^\infty C_{n-1}^2(y_2 - t_2)\,\phi_2(t_2)\,dt_2. \tag{33}$$

Therefore, in order to allocate properly we must solve the minimization
problem (33) subject to the constraints (32). The problem is certainly solvable.
The difficulty, however, is in the form of the answer. The answer may depend
not only on $x_3$, but also on $x_1$ and $x_2$ (the stock at $A_1$ and $A_2$). It
may depend on $x_1$ and $x_2$; generally it will not unless the stock levels
$x_1$ and $x_2$ are seriously out of balance. But if the solution to the
minimization problem does depend on $x_1$ and $x_2$, it will depend on them
jointly and a factorization of the type given by Equation (29) would not be
obtained.
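The dependence of the rationing rule on the initial stocks can be seen in a small
sketch of problem (33) subject to (32), solved by enumeration over integer
allocations. The function name `allocate` and the quadratic stand-ins for the two
expected future-cost terms are hypothetical and serve only as an illustration;
the transportation term is omitted because it is constant once $y_1 + y_2 = x_3$.

```python
def allocate(x1, x2, x3, future_cost_1, future_cost_2):
    """Split echelon stock x3 into (y1, y2) with y1 + y2 = x3, y1 >= x1, y2 >= x2.

    future_cost_i(y) stands in for the discounted expected cost of starting
    the next period at installation i from level y (the integral terms in (33)).
    """
    best = None
    for y1 in range(x1, x3 - x2 + 1):   # y1 >= x1 and y2 = x3 - y1 >= x2
        y2 = x3 - y1
        total = future_cost_1(y1) + future_cost_2(y2)
        if best is None or total < best[0]:
            best = (total, y1, y2)
    if best is None:
        raise ValueError("constraints (32) cannot be met: x1 + x2 exceeds x3")
    return best[1], best[2]


def cost(y):
    """Hypothetical convex future cost, minimized at a stock level of 5."""
    return (y - 5) ** 2


balanced = allocate(0, 0, 8, cost, cost)      # stocks in balance
unbalanced = allocate(6, 0, 8, cost, cost)    # installation 1 already holds 6
```

When the stocks are in balance the split depends on $x_3$ alone; once $x_1$ is
large enough for its constraint to bind, the same $x_3$ yields a different split.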

Let us assume that such a lack of balance does not occur. Then $y_1$ and $y_2$
would be selected to minimize (33) subject to

$$y_1 + y_2 = x_3 \tag{34}$$

alone. Call these solutions $\bar{x}_n^1(x_3)$ and $\bar{x}_n^2(x_3)$. Then
Equation (30) would read

$$C_n(x_1, x_2, x_3) = C_n^1(x_1) + C_n^2(x_2) + \Lambda_n(x_1, x_2, x_3) + \min_{z \ge 0} \Big\{ c(z) + L(x_3) + \alpha \iint g_{n-1}(x_3 + z - t_1 - t_2)\,\phi_1(t_1)\,\phi_2(t_2)\,dt_1\,dt_2 \Big\}, \tag{35}$$

where

$$\Lambda_n(x_1, x_2, x_3) = c_1\big(\bar{x}_n^1(x_3) + \bar{x}_n^2(x_3) - x_1 - x_2\big) + L^1(x_1) + L^2(x_2) + \alpha \int_0^\infty C_{n-1}^1\big(\bar{x}_n^1(x_3) - t_1\big)\,\phi_1(t_1)\,dt_1 - C_n^1(x_1) + \alpha \int_0^\infty C_{n-1}^2\big(\bar{x}_n^2(x_3) - t_2\big)\,\phi_2(t_2)\,dt_2 - C_n^2(x_2), \tag{36}$$

for $x_3 < \bar{x}_n^1 + \bar{x}_n^2$ and zero otherwise. However, just as in
Section III, if $x_1 < \bar{x}_n^1$ and $x_2 < \bar{x}_n^2$, this may be shown to
be a function of $x_3$ alone, and this is the function that is to be taken to
augment the natural costs at echelon $B$.

We repeat that Equations (35) and (36) are derivable only by means of the
assumption that the stocks at installations $A_1$ and $A_2$ are not out of
balance. Since such balance is expected to occur rather frequently, it suggests
that Clark’s approximation is an excellent one for this model.

5. Extensions 

The discussion in Sections III and IV assumed that demand originates in the
system at the lowest installation (echelon 1) and at no other point in the system.
This however is not a necessary assumption and, in fact, the probability
distributions used for the various echelons need have no relationship with each
other. This may be demonstrated by considering the proof of optimality in
Section III for the simple two-echelon example.

If $f_1(t_1)$ and $f_2(t_2)$ represent, respectively, the marginal demand
distribution at echelon 1 and echelon 2, and $f(t_1, t_2)$ the joint
distribution, then Equation (14) may be rewritten as follows:

$$C_n(x_1, w_1, x_2) = \min_{\substack{x_1 + w_1 \le y \le x_2 \\ 0 \le z}} \Big\{ c(z) + c_1(y - x_1 - w_1) + L(x_2) + L(x_1) + \alpha \iint C_{n-1}(x_1 + w_1 - t_1,\; y - x_1 - w_1,\; x_2 + z - t_2)\,f(t_1, t_2)\,dt_1\,dt_2 \Big\}. \tag{37}$$

Following through the proof of Theorem 1 by substituting

$$C_n(x_1, w_1, x_2) = C_n(x_1, w_1) + g_n(x_2)$$

(applied at stage $n - 1$) in Equation (37), we obtain

$$C_n(x_1, w_1, x_2) = \min_{\substack{x_1 + w_1 \le y \le x_2 \\ 0 \le z}} \Big\{ c(z) + c_1(y - x_1 - w_1) + L(x_2) + L(x_1) + \alpha \int_0^\infty C_{n-1}(x_1 + w_1 - t_1,\; y - x_1 - w_1)\,f_1(t_1)\,dt_1 + \alpha \int_0^\infty g_{n-1}(x_2 + z - t_2)\,f_2(t_2)\,dt_2 \Big\}, \tag{38}$$

which is the same form as Equation (17). The remainder of the proof is the same
as in Section III.

The ability to assign different distributions to the various echelons has several
interesting applications. For example, the $N$-installation problem of Section III
may be interpreted as $N$ stages of production, where the time required for
production in each stage is analogous to the delivery times in the inventory
problem. The final stage of production (analogous to installation 1 in the
inventory problem) is faced with an exogenous demand while each production stage
may incur random losses through spoilage. The probability distribution used for
the final production stage is the exogenous demand distribution augmented by
losses during the stage. This distribution is successively augmented by losses in
the other production stages to obtain distributions for these stages. The per unit
ordering cost for each stage is the fabrication cost in the immediately prior
stage. This example represents the case when the mean demand is an increasing
function of the echelon number, i.e., the higher the echelon, the higher the mean.

An example of the opposite case is encountered in the inventory problem where
items are regenerated through repair. Considering the problem of Section III
again, suppose that items issued from installation 1 are exchanged for damaged
items (on a one for one basis) which then undergo repair cycles of different
durations according to the degree of damage. Thus, if $t$ items are issued, then
$t$ reparable items are generated, with different portions, $t_1, t_2, \ldots$
($t = \sum_k t_k$), being successively more remote, timewise, from being
available for reissue. Here, the net demand faced by echelon $k$ is given by
$t - \sum_{i<k} t_i$, which is a decreasing function of $k$. If, throughout the
repair cycle, items are scrapped as being uneconomically reparable, then the mean
demand as a function of echelon number may be more general than the monotonically
increasing or decreasing functions considered above.

Problems of the type described in Sections III and IV, together with the
interpretations mentioned above, may be combined to portray almost any inventory
and/or production structure. Such combinations may be used to make supply,
repair, and production decisions in an integrated fashion. Of course, in each
application, the assumptions of the method must be analyzed with respect to their
validity or effect.

A REDISTRIBUTION MODEL WITH SET-UP CHARGE* 1

S. G. ALLEN 

Stanford Research Institute 

This paper considers the problem of redistributing stock among several
user activities within the period between regular deliveries of new supplies
to the system. The cost of redistribution is assumed to be proportional to the
number of shipments among the activities. A procedure based on minimizing
total redistribution and shortage costs within the period is given for
determining the amounts (if any) to be shipped among activities.

Introduction 

Suppose an item of material is carried in stock at each of several user activities. 
It is assumed that periodically the item is procured for the entire system of 
activities from an outside source of supply and delivered after a fixed time to 
individual activities to replenish their stock. Because of random usage at each 
activity, the stock at some of them may become insufficient to protect against 
shortages which may occur before the next delivery of new supplies to the 
system. Therefore at each regular review within a delivery cycle, it may be 
desirable to redistribute stock among activities. However it is assumed that this 
will involve a cost proportional to the number of shipments made. 

The procedure developed in this paper provides criteria which enable a central
inventory manager to determine whether any activity has excess or deficit
stock for the period before the next scheduled delivery. It also permits him to
determine amounts to be redistributed among activities to rectify the imbalance.
These criteria are based on an objective of minimizing the cost of the
redistribution and the shortage costs which may be incurred by activities within
the period until the next delivery.

Neither the alternative allocations of new procurement to activities at the 
end of this period nor the redistributions possible at subsequent reviews within 
this period are considered. This simplification may nevertheless provide useful 
results if existing procurement and allocation policies can be relied upon to 
correct basic stock inadequacies over the long term and if any redistribution 
between scheduled deliveries will render subsequent redistributions within the 
same period unnecessary. 

* Received May 1961.

1 This paper represents work done in connection with Stanford Research Institute’s
contract with the Bureau of Supplies and Accounts of the U. S. Navy. An earlier version
of this paper was given before the Nineteenth National Meeting of the Operations Research
Society of America, Chicago, May 25 and 26, 1961.

Background of the Problem

Reference 1 described a redistribution model which was essentially of the
following form: The decision variables were the non-negative quantities $x_{ij}$
to be currently and instantaneously distributed from the $i$th activity to the
$j$th activity $(i, j = 1, \ldots, N)$, where $N$ was the total number of
activities. The shortage penalty for a single activity to be associated with the
redistributions $x_{ij}$ was the expected number of shortages as of some future
point of time (e.g., the time at which a future allocation was to be delivered
into the system). Expected shortage at the $i$th activity was defined as

$$G_i = \int_{S_i^0 + \sum_j x_{ji} - \sum_j x_{ij}}^{\infty} \Big( y - S_i^0 - \sum_j x_{ji} + \sum_j x_{ij} \Big)\,dF_i(y), \tag{1}$$

where

$S_i^0$ = stock on hand at $i$ before the redistribution is effected;

$F_i$ = the probability distribution of accumulated demand at $i$ from the
present to the future point of time in question.

The only cost parameter included in the model was $c_{ij}$, the ratio of the unit
cost of transportation between the $i$th and $j$th activities to a common unit
cost of shortage at these activities.

The total system cost to be minimized by the redistribution was then

$$G = \sum_{i=1}^{N} G_i + \sum_{i=1}^{N} \sum_{j=1}^{N} c_{ij}\,x_{ij}. \tag{2}$$

Theorem I of Ref. 1 completely characterized the solution; namely,

For $i, j = 1, \ldots, N$, either

$$\int_{S_j^0 + \sum_k x_{kj} - \sum_k x_{jk}}^{\infty} dF_j \;-\; \int_{S_i^0 + \sum_k x_{ki} - \sum_k x_{ik}}^{\infty} dF_i = c_{ij}, \tag{3}$$

or $x_{ij} = 0$.

In words, a positive redistribution between any two activities equates the
difference between the shortage probability of the receiving activity and that
of the shipping activity to the (relative) unit cost of transportation.

Of subsequent interest in this memorandum is the case where $c_{ij} = 0$ and
the $F_i$ denote normal distributions with mean $\mu_i$ and variance
$\sigma_i^2$. Whenever $c_{ij} = 0$ for all $i, j$, redistribution equates the
shortage probabilities of all activities to a common value. In addition, when the
normal assumption is made, property (3) requires only that the total amount
received or shipped by an activity is:

$$\sum_{j=1}^{N} x_{ij} = \sigma_i (t_i - t), \tag{4}$$

in the case that activity $i$ is a shipper, or

$$\sum_{i=1}^{N} x_{ij} = \sigma_j (t - t_j), \tag{5}$$

in the case that activity $j$ is a receiver, where

$$t_i = (S_i^0 - \mu_i)/\sigma_i, \qquad i = 1, \ldots, N, \tag{6}$$

and

$$t = \frac{\sum_i (S_i^0 - \mu_i)}{\sum_i \sigma_i} \tag{7}$$

denote the normalized stock positions of activities before and after
redistribution, respectively.
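Properties (4) through (7) translate directly into a short computation. The
sketch below assumes zero unit transportation cost and normal demand, as in the
text; the function name and the sample figures are hypothetical.

```python
def equalizing_shipments(stock, mean, sigma):
    """Return (t, net) for the zero-transport-cost, normal-demand case.

    t is the common normalized stock position of (7); net[i] is
    sigma_i * (t_i - t), positive when activity i ships (property (4))
    and negative when it receives (property (5)).
    """
    t = sum(s - m for s, m in zip(stock, mean)) / sum(sigma)
    t_before = [(s - m) / sd for s, m, sd in zip(stock, mean, sigma)]  # (6)
    net = [sd * (ti - t) for sd, ti in zip(sigma, t_before)]
    return t, net


# Hypothetical three-activity example.
t_common, net_shipments = equalizing_shipments(
    stock=[30.0, 10.0, 20.0], mean=[15.0, 15.0, 15.0], sigma=[5.0, 5.0, 10.0]
)
```

The net amounts sum to zero, since redistribution only moves stock within the
system.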

Unfortunately, after a redistribution satisfying these conditions has been
made, the stock positions of activities would most certainly depart from the
criteria stated. Thus, redistributions would be required among the same
activities every time the criteria were applied. A possible remedy for this
situation is the introduction of set-up charges into the model. In general,
set-up charges would tend to deter frequent redistributions of small amounts
among activities.


Assumptions of the Set-up Charge Model 

Specifically, it will be assumed that a single set-up charge K is incurred for 
each shipment between a pair of activities. As in the case of unit transportation 
cost, K must actually be the ratio of the set-up charge to the unit shortage cost. 

Because of inherent difficulties associated with a model including set-up
charges, a redistribution with a set-up charge is considered here under the
simplest of the transportation cost assumptions previously studied; namely,
zero unit transportation cost. Thus, total system cost of redistribution becomes

$$G = \sum_{i=1}^{N} G_i + K \sum_{i=1}^{N} \sum_{j=1}^{N} \delta(x_{ij}), \tag{8}$$

where

$$\delta(z) = \begin{cases} 0 & \text{if } z = 0 \\ 1 & \text{otherwise.} \end{cases}$$

A normal distribution of demand at each activity will also be assumed. Finally,
to avoid trivial considerations in the discussion to follow, it will be assumed
that for no pair $(i, j)$ will $\sigma_i = \sigma_j$ and $t_i = t_j$.

The Class of Optimal Redistributions 

Suppose $\{x_{ij}^*\}$ minimizes $G$ in (8). Then clearly the $x_{ij}^*$ which are
positive must still satisfy the characterization given in (4) through (7), where
the sums in the last expression include only those activities for which
$x_{ij}^*$ is positive. Otherwise, infinitesimal changes in the positive
$x_{ij}^*$ could reduce $\sum G_i$ in (8) with no increase in the amount of set-up
charges. In effect, if the integers $i$ and $j$ for which $x_{ij}^* > 0$ were
somehow known, a minimizing solution would be completely determined at this point
in terms of the sums $\sum_j x_{ij}$ and $\sum_i x_{ij}$.

In any redistribution where there are exactly $m$ integers $i_1, \ldots, i_m$ for
which $\sum_j x_{ij} > 0$, and exactly $n$ (different) integers
$j_1, \ldots, j_n$ for which $\sum_i x_{ij} > 0$, it will always be possible to
accomplish this redistribution with no more than $m + n - 1$ of the $x_{ij}$’s
positive. The smallest number possible is, of course, $\max(m, n)$.

To avoid computational difficulties, which appear to be of little practical
significance, it will henceforth be assumed that any redistribution with
$\sum_j x_{ij}$ and $\sum_i x_{ij}$ positive only for $i_1, \ldots, i_m$ and
$j_1, \ldots, j_n$, respectively, requires exactly $m + n - 1$ values of $x_{ij}$
that are positive. For, if less than this number occurred in redistributions
satisfying expressions (4) through (7), as modified, then these expressions imply
that there exist proper subsets $I$ of the integers $i_1, \ldots, i_m$, and $J$
of the integers $j_1, \ldots, j_n$, so that

$$\sum_{i \in I} (S_i^0 - \mu_i - \sigma_i t) = \sum_{j \in J} (\sigma_j t - S_j^0 + \mu_j), \tag{9}$$

where

$$t = \frac{\sum_{k=1}^{m} (S_{i_k}^0 - \mu_{i_k}) + \sum_{k=1}^{n} (S_{j_k}^0 - \mu_{j_k})}{\sum_{k=1}^{m} \sigma_{i_k} + \sum_{k=1}^{n} \sigma_{j_k}}. \tag{10}$$

In other words, the initial stockpiles of the activities, i.e., the $S_i^0$, must
satisfy rather special linear dependencies. This is, perhaps, an unlikely state
of affairs.

Consider the set of all redistributions $\{x_{ij}\}$ in which the $x_{ij}$ satisfy
property (4) for $i$ belonging to a set of integers $\{i_1, \ldots, i_m\}$, which
satisfy property (5) for $j$ belonging to a (disjoint) set
$\{j_1, \ldots, j_n\}$, and which are zero otherwise; where $m \ge 1$, $n \ge 1$,
and $m + n \le N$; where $t$ is given by (10); and where $t_i$ is defined by (6)
for all $i$. In view of the above assumption about the required number of
set-ups, $G$ may be written for such $x_{ij}$ as

$$G = \sum_{i=1}^{N} G_i + K \Big[ \sum_{i=1}^{N} \delta\Big(\sum_{j=1}^{N} x_{ij}\Big) + \sum_{j=1}^{N} \delta\Big(\sum_{i=1}^{N} x_{ij}\Big) \Big] - K. \tag{11}$$

$G$ is now dependent only on the sums $\sum_j x_{ij}$ and $\sum_i x_{ij}$, i.e.,
the total shipments from or to an activity.

Note also that in (11) the $G_i$ terms are summed over all $i$. The relevant
system cost of a redistribution which excludes a given activity as either shipper
or receiver must, nevertheless, include the shortage penalty for that activity,
in order that the cost of this redistribution may be compared with one that
includes that activity.

In fact, since $G_i$ simplifies to $\sigma_i\,g(t_i')$ for any $i$, where $t_i'$
denotes the normalized stock position of activity $i$ after redistribution is
effected, and where

$$g(z) = \int_z^\infty (u - z)\,\frac{e^{-u^2/2}}{\sqrt{2\pi}}\,du, \tag{12}$$

$G$ may be written entirely in terms of the $t_i$ variables and variables $y_i$
defined by

$$y_i = \begin{cases} 1 & \text{if } i \in \{i_1, \ldots, i_m, j_1, \ldots, j_n\} \\ 0 & \text{otherwise;} \end{cases} \tag{13}$$

namely,

$$G = G(y) = \sum_{i=1}^{N} (1 - y_i)\,\sigma_i\,g(t_i) + g[t(y)] \sum_{i=1}^{N} y_i \sigma_i + K \sum_{i=1}^{N} y_i - K$$

$$\phantom{G} = \sum_{i=1}^{N} \sigma_i\,g(t_i) - K + \Big( \sum_{i=1}^{N} y_i \sigma_i \Big) \Big\{ g[t(y)] - \sum_{i=1}^{N} \frac{y_i \sigma_i}{\sum_j y_j \sigma_j} \Big[ g(t_i) - \frac{K}{\sigma_i} \Big] \Big\}, \tag{14}$$

where

$$t(y) = \frac{\sum_{i=1}^{N} y_i \sigma_i t_i}{\sum_{i=1}^{N} y_i \sigma_i}. \tag{15}$$
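The two building blocks of (14), the function $g$ of (12) and the pooled
position $t(y)$ of (15), can be evaluated with the standard library alone. The
sketch below uses the closed form $g(z) = \varphi(z) - z\,[1 - \Phi(z)]$ for the
standard normal density $\varphi$ and distribution function $\Phi$, which follows
from (12) by direct integration; the function names are hypothetical.

```python
import math


def normal_loss(z):
    """g(z) of (12): integral over u >= z of (u - z) times the N(0, 1) density."""
    density = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    upper_tail = 0.5 * math.erfc(z / math.sqrt(2.0))   # 1 - Phi(z)
    return density - z * upper_tail


def pooled_position(y, sigma, t):
    """t(y) of (15): sigma-weighted mean of the t_i over the activities with y_i > 0."""
    weight = sum(yi * si for yi, si in zip(y, sigma))
    if weight == 0:
        raise ValueError("t(y) is undefined when every y_i is zero")
    return sum(yi * si * ti for yi, si, ti in zip(y, sigma, t)) / weight
```

For example, `pooled_position([1, 1, 0], [1.0, 1.0, 5.0], [1.0, -1.0, 9.0])` pools
only the first two activities and ignores the third.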


If a positive redistribution is optimal, the problem becomes that of minimizing
$G$ with respect to a set of integers $i$ for which $y_i = 1$, i.e., a set of
activities which actually ship or receive material in redistribution. While the
computation of $G$ could be made very quickly for any specified combination of
activities, the total number which must be considered, namely,
$\sum_{k=2}^{N} \binom{N}{k}$, is obviously large for large $N$. In addition, note
that no redistribution is always a potential candidate for the optimal
redistribution. This particular redistribution, i.e., $y_i = 0$ for all $i$, will
subsequently be denoted by the symbol $\omega$. The “combinations” involving only
one activity in redistribution are ruled out of course.
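For small $N$ the combinatorial search can simply be carried out. The following
is a brute-force sketch, not the procedure developed in the paper: it evaluates
(14) for every set of two or more activities and compares the result with the
cost of no redistribution, taken from (8) as $\sum_i \sigma_i g(t_i)$. Function
names and sample data are hypothetical.

```python
import itertools
import math


def normal_loss(z):
    """g(z) of (12) in closed form for the standard normal distribution."""
    density = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return density - z * 0.5 * math.erfc(z / math.sqrt(2.0))


def system_cost(members, sigma, t, K):
    """Expression (14) when y_i = 1 exactly for the activities in members."""
    weight = sum(sigma[i] for i in members)
    pooled = sum(sigma[i] * t[i] for i in members) / weight        # t(y) of (15)
    outside = sum(sigma[i] * normal_loss(t[i])
                  for i in range(len(t)) if i not in members)
    return outside + weight * normal_loss(pooled) + K * (len(members) - 1)


def best_redistribution(sigma, t, K):
    """Enumerate all sets of at least two activities; () means no redistribution."""
    n = len(t)
    best_set = ()
    best_cost = sum(s * normal_loss(ti) for s, ti in zip(sigma, t))
    for size in range(2, n + 1):
        for members in itertools.combinations(range(n), size):
            cost = system_cost(members, sigma, t, K)
            if cost < best_cost:
                best_set, best_cost = members, cost
    return best_set, best_cost


# Hypothetical data: activity 0 is overstocked, 1 is short, 2 is in between.
cheap_setup = best_redistribution([1.0, 1.0, 1.0], [1.0, -1.0, 0.0], K=0.1)
dear_setup = best_redistribution([1.0, 1.0, 1.0], [1.0, -1.0, 0.0], K=10.0)
```

The number of sets examined grows as $2^N - N - 1$, which is why the enumeration
is practical only for a handful of activities.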


A Related Problem 

If the domain of definition of $G(y)$ in (14) is extended over the set

$$Y = \{ y : y = (y_1, \ldots, y_N),\; 0 \le y_i \le 1 \},$$

then a formally related (but not equivalent) minimizing problem to that of
finding the optimal redistribution may be considered. The related problem
can be subjected to a more complete analysis than the combinatorial one that
is actually presented by redistribution. And furthermore, the minimization of
the related problem turns up a solution that indicates a close approximation
to a redistribution solution.

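It is clear that $G$ is differentiable at every point of $Y$ except $\omega$
(where, nevertheless, it is continuous). Also it can be shown that

Theorem I. $G$ is convex over $Y$.

Proof: In view of the form of $G$ as written in (14), all that need be shown is
that for $y \in Y$, $y^* \in Y$ and $y \ne y^*$, and for $\alpha$ such that
$0 \le \alpha \le 1$,

$$g\{t[\alpha y + (1 - \alpha) y^*]\} \Big[ \alpha \sum y_i \sigma_i + (1 - \alpha) \sum y_i^* \sigma_i \Big] \le \alpha\,g[t(y)] \sum y_i \sigma_i + (1 - \alpha)\,g[t(y^*)] \sum y_i^* \sigma_i \tag{16}$$

or, more explicitly,

$$g\Big( \frac{\alpha \sum y_i \sigma_i t_i + (1 - \alpha) \sum y_i^* \sigma_i t_i}{\alpha \sum y_i \sigma_i + (1 - \alpha) \sum y_i^* \sigma_i} \Big) \Big[ \alpha \sum y_i \sigma_i + (1 - \alpha) \sum y_i^* \sigma_i \Big] \le \alpha\,g\Big( \frac{\sum y_i \sigma_i t_i}{\sum y_i \sigma_i} \Big) \sum y_i \sigma_i + (1 - \alpha)\,g\Big( \frac{\sum y_i^* \sigma_i t_i}{\sum y_i^* \sigma_i} \Big) \sum y_i^* \sigma_i. \tag{17}$$

Let $z = \sum t_i\,(y_i \sigma_i / \sum y_i \sigma_i)$,
$z^* = \sum t_i\,(y_i^* \sigma_i / \sum y_i^* \sigma_i)$, and
$\beta = \alpha (\sum y_i \sigma_i) / [\alpha \sum y_i \sigma_i + (1 - \alpha) \sum y_i^* \sigma_i]$.
Then (17) is equivalent to

$$g[\beta z + (1 - \beta) z^*] \le \beta\,g(z) + (1 - \beta)\,g(z^*), \tag{18}$$

which, because $0 \le \beta \le 1$, is true from the convexity of $g$ as defined
in expression (12).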

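With these properties of $G(y)$ established, then, as a special case of Theorem
I of reference 1, it follows that if a point $y \ne \omega$ minimizes $G$ over
$Y$, such a point is completely characterized by the property:

For every $i$, either

$$\frac{\partial G}{\partial y_i} = 0$$

or

$$y_i = \begin{cases} 0 & \text{and } \dfrac{\partial G}{\partial y_i} > 0 \\[1ex] 1 & \text{and } \dfrac{\partial G}{\partial y_i} < 0, \end{cases} \tag{19}$$

where

$$\frac{\partial G}{\partial y_i} = \sigma_i \big\{ g[t(y)] - g(t_i) + [t_i - t(y)]\,g'[t(y)] \big\} + K. \tag{20}$$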
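Expression (20) can be checked numerically against (14) by finite differences.
The sketch below is such a check under the normal assumption, using
$g'(z) = -[1 - \Phi(z)]$, which follows from differentiating (12); the function
names, the step size, and the sample point are all hypothetical.

```python
import math


def g(z):
    """g(z) of (12) in closed form."""
    density = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return density - z * 0.5 * math.erfc(z / math.sqrt(2.0))


def g_prime(z):
    """Derivative of g: minus the upper tail probability of N(0, 1)."""
    return -0.5 * math.erfc(z / math.sqrt(2.0))


def pooled(y, sigma, t):
    """t(y) of (15)."""
    return (sum(a * s * u for a, s, u in zip(y, sigma, t))
            / sum(a * s for a, s in zip(y, sigma)))


def G(y, sigma, t, K):
    """First form of (14), extended to fractional y."""
    weight = sum(a * s for a, s in zip(y, sigma))
    stay = sum((1.0 - a) * s * g(u) for a, s, u in zip(y, sigma, t))
    return stay + g(pooled(y, sigma, t)) * weight + K * sum(y) - K


def dG(i, y, sigma, t, K):
    """Partial derivative of G with respect to y_i, expression (20)."""
    ty = pooled(y, sigma, t)
    return sigma[i] * (g(ty) - g(t[i]) + (t[i] - ty) * g_prime(ty)) + K


def finite_difference(i, y, sigma, t, K, h=1e-6):
    """Central difference approximation to the same partial derivative."""
    up = list(y)
    down = list(y)
    up[i] += h
    down[i] -= h
    return (G(up, sigma, t, K) - G(down, sigma, t, K)) / (2.0 * h)


y0, sigma0, t0, K0 = [0.7, 0.4, 0.9], [1.0, 2.0, 1.5], [1.0, -0.5, 0.3], 0.1
gaps = [abs(dG(i, y0, sigma0, t0, K0) - finite_difference(i, y0, sigma0, t0, K0))
        for i in range(3)]
```

Agreement at an interior point of $Y$ is only a sanity check on the algebra, not
a proof of (20).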

The necessary and sufficient condition that $G(\omega) > \min G(y)$ is
straightforward. The expression for $G$ as given in (14) is composed of a
component $\sum \sigma_i\,g(t_i) - K$, which is $G(\omega)$, and a component
dependent on $y$, namely

$$\Big( \sum_i y_i \sigma_i \Big) \Big\{ g[t(y)] - \sum_i \frac{y_i \sigma_i}{\sum_j y_j \sigma_j} \Big[ g(t_i) - \frac{K}{\sigma_i} \Big] \Big\}. \tag{21}$$

Therefore $G(\omega) > \min G(y)$ if and only if there exists a $y \ne \omega$
such that expression (21) is negative. The existence of such a point $y$ is
equivalent to the existence of $\alpha_1, \ldots, \alpha_N$ so that
$0 \le \alpha_i \le 1$, $\sum \alpha_i = 1$, $0 < \alpha_i < 1$ for at least one
$i$, and so that

$$g\Big( \sum_i \alpha_i t_i \Big) < \sum_i \alpha_i \Big[ g(t_i) - \frac{K}{\sigma_i} \Big]. \tag{22}$$

2 The necessity for this follows from the fact that if a $y$ not equal to $\omega$ minimizes $G$, then
expression (21) is identical to $\sum y_i\,(\partial G/\partial y_i)$, which by property (19) is non-positive.
Sufficiency is obvious.

This in turn is equivalent to the existence of at least two integers, say $i_1$
and $i_2$, between 1 and $N$, and an $\alpha_{i_1}$, where $0 < \alpha_{i_1} < 1$,
such that (22) holds for $\alpha_{i_2} = 1 - \alpha_{i_1}$.

This is obviously a stronger requirement than the (strict) convexity of the
function $g$ and of course relates to the initial inventory positions of the
several activities. Graphically it would be quite easy to see if the requirement
is met: the curve described by the points $\{t, g(t)\}$ must intersect the convex
hull of the points $\{t_i,\; g(t_i) - (K/\sigma_i)\}$.
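The pairwise form of condition (22) lends itself to a simple numerical screen.
The sketch below scans a grid of weights for each pair of activities; because it
samples $\alpha$ rather than solving for it, a pair that satisfies (22) only on a
very narrow interval could be missed. Function names, the grid size, and the
sample data are hypothetical.

```python
import itertools
import math


def g(z):
    """g(z) of (12) in closed form."""
    density = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return density - z * 0.5 * math.erfc(z / math.sqrt(2.0))


def pair_satisfies_22(t1, s1, t2, s2, K, steps=999):
    """True if (22) holds for some sampled weight alpha in (0, 1) on this pair."""
    for k in range(1, steps + 1):
        a = k / (steps + 1)
        lhs = g(a * t1 + (1.0 - a) * t2)
        rhs = a * (g(t1) - K / s1) + (1.0 - a) * (g(t2) - K / s2)
        if lhs < rhs:
            return True
    return False


def redistribution_worthwhile(sigma, t, K):
    """Screen every pair of activities for condition (22)."""
    return any(
        pair_satisfies_22(t[i], sigma[i], t[j], sigma[j], K)
        for i, j in itertools.combinations(range(len(t)), 2)
    )


low_charge = redistribution_worthwhile([1.0, 1.0], [1.0, -1.0], K=0.1)
high_charge = redistribution_worthwhile([1.0, 1.0], [1.0, -1.0], K=10.0)
```

With a small set-up charge the two unbalanced activities pass the screen; with a
large one the right-hand side of (22) is negative and no pair can pass.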

These results might be summarized as 

Theorem II. A point $y \ne \omega$ minimizes $G$ over $Y$ if and only if

(i) There exists a triple $(i_1, i_2, \alpha_{i_1})$, where
    $0 < \alpha_{i_1} < 1$, so that (22) holds for
    $\alpha_{i_2} = 1 - \alpha_{i_1}$, and

(ii) For every $i$, (19) holds.

The above theorem effectively removes a possibly troublesome matter in the
definition of $G$ as given by (14). $G(y)$ assumes the value
$\sum \sigma_i\,g(t_i) - K$ at $y = \omega$, whereas the original expression (8)
assumes, appropriately, the value $\sum \sigma_i\,g(t_i)$. Theorem II demonstrates
that if a point $y \ne \omega$ exists satisfying property (19) for each of its
coordinates, then necessarily $G(y) < \sum \sigma_i\,g(t_i) - K$.

Moreover, Theorem III below makes it possible to ignore how $G$ is defined over
the region $\{ y : 0 < \sum y_i \le 1 \}$.

This characterization given by Theorem II of the solution to the problem of
minimizing G over Y would be a satisfactory resolution of the redistribution
problem were it not for the possibility that the minimizing point may have a
coordinate $y_i$ such that $0 < y_i < 1$. As previously mentioned, meaningful
redistributions can only be associated with certain vertices of Y, in particular
the set of points

Y = \y: y € Y, y,- = 0 or 1, & ^ 1]. 

Clearly $\min_Y G(y)$ need not be assumed at such points.

However, situations in which more than one of the $y_i$ lie between zero and
one in the optimal solution would appear to be unlikely. In the first place, the
value $t(y^*)$, where $G(y^*) = \min_Y G(y)$, is unique. Otherwise, if there exists a
$y' \neq y^*$ such that $G(y') = G(y^*)$ and $t(y^*) \neq t(y')$, then Theorem I requires
that

(23) $G[\beta y^* + (1 - \beta) y'] = \beta G(y^*) + (1 - \beta) G(y') = \min_Y G(y)$

for all $\beta$ such that $0 \le \beta \le 1$. But note in the proof of Theorem I that if $z \neq z^*$
in expression (18), then that inequality and hence the inequality in expression
(16) must be strict. Therefore, with $t(y')$ and $t(y^*)$ substituted for z and z*,
respectively, in these expressions, a contradiction is found to equation (23).
Evidently $t(y^*) = t(y')$.

Therefore expression (20), which for each i is a function of t(y) alone, must
have a zero at the unique value t(y*) for more than one i. Again, here is a
situation that would require rather special relationships among the initial
stockpiles at activities.
Furthermore, one can always be assured that

Theorem III. There exists a $y'$ such that $G(y') = \min_Y G(y)$ and $0 < y'_i < 1$
for at most one i.

Proof: Note that expression (20) for each i varies only with the value of $t(y)$.
Therefore if $y^*$ minimizes G(y) and if $P = \{i: 0 < y_i^* < 1\}$ has more than one
element, then it is only necessary to exhibit a $y'$ yielding $t(y') = t(y^*)$, where
$y'_k = y_k^*$ for all $k \notin P$ and $0 < y'_k < 1$ for at most one $k \in P$. Since expression (20) for
each i must thereby have the same values for $y'$ as for $y^*$, it follows that $y'$
must also minimize G(y).

Let $Q = \{i: y_i^* = 1\}$, $K_1 = \sum_{i \in Q} \sigma_i t_i$, $K_2 = \sum_{i \in Q} \sigma_i$, and
$t_A = (\sum_{i \in A} \sigma_i t_i)/(\sum_{i \in A} \sigma_i)$, when A is non-null. Suppose, for instance,
$K_1 \le K_2 t(y^*)$. (The case of equality is worth consideration only in the event Q is
null, in which case $K_1$ and $K_2$ should be taken to be zero in the following; if
equality holds and $K_2 > 0$, then the desired $y'$ is already at hand.) The set
$P^* = \{i: i \in P, t_i > t(y^*)\}$ is evidently non-null, and, since $t(y^*)$ is essentially a
weighted average of the $t_i$, it follows that

$$\frac{K_1 + \sum_{i \in P} y_i^* \sigma_i t_i}{K_2 + \sum_{i \in P} y_i^* \sigma_i} = t(y^*)$$

or, equivalently,

$$K_1 + \sum_{i \in P} y_i^* \sigma_i [t_i - t(y^*)] = K_2 t(y^*).$$

But

$$\sum_{i \in P^*} \sigma_i [t_i - t(y^*)] > \sum_{i \in P} y_i^* \sigma_i [t_i - t(y^*)],$$

and therefore

$$t_{P^* + Q} > t(y^*).$$

Because of this, there must exist a smallest set $P' \subset P^*$ such that $t_{P'+Q} > t(y^*)$;
i.e., a smallest set in terms of the number of its elements.

If one element is excluded from $P'$ (call the reduced set $P''$), then
$t_{P''+Q} \le t(y^*)$. Hence, there exists a $y_k$ such that $k \in P'$, $0 \le y_k \le 1$, and

$$\frac{K_1 + \sum_{i \in P''} \sigma_i t_i + y_k \sigma_k t_k}{K_2 + \sum_{i \in P''} \sigma_i + y_k \sigma_k} = t(y^*).$$

The desired $y'$ may then be defined by $y'_i = 1$ if $i \in P'' + Q$, $y'_k = y_k$ and
$y'_i = 0$ otherwise.

A simple rephrasing of the argument handles the case where $K_1 > K_2 t(y^*)$.
The above theorem insures that if $G(y^*) = \min_Y G(y) < G(\omega)$, then $\sum_i y_i^* > 1$.
It also affirms the uniqueness of the minimizing solution in the case that
expression (20), evaluated at $t(y^*)$, is zero for at most one i.

It has been shown that a hyperplane known to have an intersection with
the N-dimensional cube must have a point in common with one of its edges. The
construction of the above proof gave a useful insight into the problem of finding y*.
It has been [remainder of this footnote missing in the source]


A Computing Procedure 

In this discussion it will be assumed that $\min_Y G(y) < G(\omega)$.

Expression (20) set equal to zero may be considered as a condition on $t = t(y)$
and rewritten as

(24) $\sigma_i [f(t) + t_i F(t) - f(t_i) - t_i F(t_i)] + K = 0$

where

$$f(t) = \frac{e^{-(1/2)t^2}}{\sqrt{2\pi}}$$

and

$$F(t) = \int_{-\infty}^{t} f(w)\,dw.$$

The left side of (24) is a strictly increasing function of t on the interval $(-\infty, t_i)$
and strictly decreasing on $(t_i, \infty)$. Depending on its value at $\pm\infty$, it may have
no roots, one root, or two roots. If y* minimizing G over Y does have a coordinate
between zero and one, then among the collection of roots for equation (24)
for $i = 1, \ldots, N$, is one which is actually assumed by t(y) for y satisfying
property (19) and with only one $y_i$ between zero and one.
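
The root search just described can be sketched numerically. The following is a minimal illustration, not the author's procedure: it assumes (24) has the form $\sigma_i[f(t) + t_i F(t) - f(t_i) - t_i F(t_i)] + K = 0$ with f and F the standard normal density and distribution function, and it uses plain bisection on each side of $t_i$, where the left side is monotone. The function names and the search half-width `span` are hypothetical choices made for this sketch.

```python
import math

def f(t):
    """Standard normal density, the f(t) of (24)."""
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)

def F(t):
    """Standard normal distribution function, the F(t) of (24)."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

def lhs24(t, sigma_i, t_i, K):
    """Left side of (24) for activity i, viewed as a function of t."""
    return sigma_i * (f(t) + t_i * F(t) - f(t_i) - t_i * F(t_i)) + K

def bisect(h, lo, hi, tol=1e-12):
    """Plain bisection; assumes h(lo) and h(hi) have opposite signs."""
    h_lo = h(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        h_mid = h(mid)
        if (h_mid > 0) == (h_lo > 0):
            lo, h_lo = mid, h_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def roots24(sigma_i, t_i, K, span=12.0):
    """Roots of (24) in [t_i - span, t_i + span].

    The left side equals K > 0 at t = t_i, rises up to t_i and falls after it,
    so each side of t_i holds at most one root.
    """
    h = lambda t: lhs24(t, sigma_i, t_i, K)
    roots = []
    if h(t_i - span) < 0:
        roots.append(bisect(h, t_i - span, t_i))
    if h(t_i + span) < 0:
        roots.append(bisect(h, t_i, t_i + span))
    return roots

# Activity 6 of the numerical example in this paper: sigma = 2, t = 1.5, K = .25
print(roots24(2.0, 1.5, 0.25))
```

For these inputs the sketch finds a single root, below $t_i$ and close to 0.544, which is in line with the value of t(y*) reported in the paper's numerical example.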

If this does not occur, i.e., the minimum of G over Y occurs at a vertex of
Y, then the simple behavior of the left-hand side of (24) and knowledge of the
roots should be suggestive of the appropriate combination of activities to produce
a t(y) satisfying property (19). (Incidentally, this vertex which minimizes G
over Y must be unique; otherwise the convexity of G must permit a y with at
least one coordinate between zero and one also to minimize G over Y.) Also
available is a method like the simplex corrected gradient procedure described in
Ref. 2. (See Section 6 of that article for the application to concave
programming.) If the minimum of G occurs at a vertex, this procedure will indeed
terminate.

Of course any procedure for extremizing convex or concave functions can be
applied to the present minimizing problem. The advantage of making the analysis
described in this paper lies in the possible avoidance either of examining a large
number of activity combinations for redistribution purposes or of using an
extensive iterative procedure which might be required in a more general purpose
computing scheme.

It will be remembered that when $\min_Y G(y) < \min_{\bar{Y}} G(y)$, property (19)
does not characterize the solution of the “real” redistribution problem. One
might think that in this case the closest vertex $\bar{y}$ to the y* minimizing G over Y
would be a good approximation to the real solution; i.e., choose $\bar{y}$ such that

$\bar{y}_i = 0$ if $y_i^* \le 1/2$, $\bar{y}_i = 1$ if $y_i^* > 1/2$

as a solution to the redistribution problem. In all numerical examples tried by
the author, this has yielded a value of G very close to $\min_{\bar{Y}} G(y)$. But an even
better solution⁵ has been a $\tilde{y} \in \bar{Y}$ which is closest to the hyperplane

$\sum_i \sigma_i [t_i - t(y^*)] y_i = 0.$

In the following numerical example a value K = .25 was used in the
calculations:


  i      t_i     σ_i     y*_i      ȳ_i
  1     -1.0     3.      1.        1.
  2     -0.5     3.      1.        1.
  3      0.0     4.      0         0
  4      0.5     4.      0         0
  5      1.0     1.      0         0
  6      1.5     2.      0.2323    0
  7      2.0     1.      1.        1.
  8      2.5     3.      1.        1.

[A further column of this table, apparently giving the coordinates of ỹ, is not legible in this copy.]

  y      G(y)      t(y)
  y*     5.2531    0.5444
  ȳ      5.2577    0.5000
  ỹ      5.2540    0.5455
  ω      7.6371    -
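
The t(y) entries of the table can be checked with a few lines of code. This is only a sketch under an assumption: t(y) is taken to be the weighted average $\sum_i y_i \sigma_i t_i / \sum_i y_i \sigma_i$, which is how the proof of Theorem III treats it; the function and variable names are hypothetical.

```python
def t_of_y(y, sigma, t):
    """Weighted average of the t_i with weights y_i * sigma_i (assumed form of t(y))."""
    num = sum(y_i * s_i * t_i for y_i, s_i, t_i in zip(y, sigma, t))
    den = sum(y_i * s_i for y_i, s_i in zip(y, sigma))
    return num / den

# Data of the numerical example (K = .25 does not enter t(y)).
t = [-1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0, 2.5]
sigma = [3.0, 3.0, 4.0, 4.0, 1.0, 2.0, 1.0, 3.0]
y_star = [1.0, 1.0, 0.0, 0.0, 0.0, 0.2323, 1.0, 1.0]

# Rounding rule from the text: 0 if y*_i <= 1/2, 1 otherwise.
y_bar = [1.0 if y_i > 0.5 else 0.0 for y_i in y_star]

print(round(t_of_y(y_star, sigma, t), 4))  # compare with 0.5444 in the table
print(round(t_of_y(y_bar, sigma, t), 4))   # compare with 0.5000 in the table
```

Both values agree with the tabulated t(y*) and t(ȳ), which supports reading the second and third columns of the first table as $t_i$ and $\sigma_i$.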
DEVELOPMENT AND EVALUATION OF SURVEILLANCE

SAMPLING PLANS*†

C. DERMAN and H. SOLOMON 

Columbia University 

The problem of maintaining quality of inventory in the presence of deterioration
is studied. Repeated application of sampling inspections together with
replacement policies is used to maintain quality. The effect of such repeated
applications can be measured in terms of the proportion of poor product existing
at any time. In this paper, the sampling plans studied are those commonly
used in acceptance sampling, and the replacement policy consists in replacing
lots judged defective by the sampling procedure. The theory of Markov processes
is used to evaluate the effectiveness of the sampling plans and replacement
policy. As illustrations, special cases are considered. Graphs are provided
for these cases.


1. Introduction 

Acceptance sampling plans have been traditionally used to evaluate the
quality of manufactured items. They can also be used, by periodic application,
to ascertain the quality of stored items. When an acceptance sampling plan is
used for such a purpose, we shall refer to it as a surveillance sampling plan. We
shall only consider manufactured items whose quality deteriorates over time.
More specifically we desire to construct decision rules for surveillance inspectors
such that a preassigned level of quality of product is available with high
probability at all times. The operating characteristics of these rules will depend
upon the deterioration rate of the quality of the item, the surveillance
sampling plan employed, and the length of the surveillance period. In this paper
we shall discuss surveillance sampling plans and their evaluation. Once evaluation
is made possible, questions of optimality can be studied.

Briefly we shall consider the following type of problem. Suppose we have k 
different lots, each of size N, which are stored. At regular intervals these lots 
are inspected. If a lot passes inspection, it is kept on hand until the next
inspection; if it does not pass inspection, it is replaced by a new lot of acceptable
quality. The only way that a lot can leave the system and be replaced is for it 
to be rejected in inspection. This situation will arise whenever the lots stored 
are to be used only in cases of emergency, as in the stockpiling of vital material. 
The quality of the lots can be expected to deteriorate with age. The aim of the 
surveillance plan is to maintain the quality of the k lots (Nk items) so that, 
for example, if an emergency arises, there will be on hand sufficient stock of 
good quality to meet the crisis. We shall show how to measure the effectiveness 
of a surveillance plan which accomplishes this purpose. Once we have an index 

* Received January 1958. 

† Research supported by Chemical Corps Engineering Command, U. S. Army, under
Contract DA 18-108-cml-6125.
of effectiveness of a plan, a catalogue of plans can be constructed. A plan can
then be available for selection which will meet some desired cost or other
relevant criteria.

Specifically, we shall assume that the surveillance procedures to be evaluated
are given by standard acceptance sampling plans, for example, those used by
government agencies in their procurement programs, where $L_j(p)$ $(j = 1, 2, \ldots)$
denotes the probability of accepting a lot when the proportion of defective
items in the lot is given by p; j denotes the number of periods the lot has already
been on hand. That is, if it is known that a lot has been on hand for j periods,
sampling plan j which has operating characteristic (OC) curve $L_j(p)$ is used to
accept or reject the lot for further storage. Thus, at least in our general
formulation, we allow the possibility of adjusting the surveillance plan according to
the age of the lot.

We shall assume the existence of a deterioration function p(t) which denotes
the proportion of defective items in a lot which has been on hand for t units
of time. Thus p(0) denotes the incoming quality of a lot. We are implicitly
assuming (i) that the lot sizes are large enough so that p(t) can be considered
independent of the particular lot and (ii) that the manufacturer’s process is
either under control at level p(0) or the initial inspection plan for new lots is
such that the incoming level is p(0).

One way to evaluate the effectiveness of a surveillance plan (i.e., the set of
sampling plans characterized by $\{L_j\}$) is to determine the proportion of
defective items on hand at any given time. This will be not only a function of
$\{L_j\}$ but also of p(t). We shall show how to evaluate this proportion under the
assumption that the surveillance plan has been in operation for a long time,
that is, assuming “steady-state” conditions. We shall also consider two special
cases: one where p(t) is a step-function with one jump, the other where p(t)
has an exponential form. In both cases we assume $L_j$ is independent of j, that
is, only one acceptance plan is used throughout surveillance.

A truncated model is also discussed and evaluated. Truncation occurs when 
a surveillance policy dictates that lots on hand for a preassigned length of time 
are automatically replaced by new lots. 

This paper represents an initial attempt at the development and evaluation
of sampling plans for the surveillance function in inventory management. Some
questions are resolved but, of course, many are posed. It is hoped that several
problems raised in this paper will receive the attention of other workers in the
field.


2. Lot Age Viewed as a Stochastic Process 

The age of the ith lot together with the deterioration function p(t) and the
application of the surveillance sampling plan generate a stochastic process
$\{X_n(i)\}$ $(n = 1, 2, \ldots)$ where $X_n(i)$ denotes the number of periods the ith lot
has been on hand just after the nth inspection. This stochastic process is, in
particular, a Markov chain¹ with stationary transition probabilities

(2.1) $p_{j,j+1} = L_{j+1}[p(j+1)]$, $p_{j,0} = 1 - L_{j+1}[p(j+1)]$, $j = 0, 1, \ldots$

where $p_{j,j+1}$ is the probability that a lot on hand for j periods will survive
(j + 1) periods, and $p_{j,0}$ is the probability that a lot on hand for j periods will
be discarded.

If $X_n(i) = j$, this means that the proportion of defectives in the ith lot, τ units
of time after the nth inspection, is $p(j + \tau)$. For the remainder of this paper
we shall assume $0 \le \tau < 1$, i.e., the periods between inspection represent one
unit of time. Let $Y_{n,\tau}(i) = p(j + \tau)$ when $X_n(i) = j$. Then the proportion of
defective items in the entire population τ units of time after the nth inspection
is

(2.2) $P_{n,\tau} = \frac{1}{k} \sum_{i=1}^{k} Y_{n,\tau}(i).$

We shall assume that k, the number of lots, is large so that by the law of large
numbers

(2.3) $P_{n,\tau} \approx E(P_{n,\tau}) = \frac{1}{k} \sum_{i=1}^{k} \sum_{j=0}^{\infty} p(j + \tau) P[X_n(i) = j].$



3. Steady State Conditions 

If the surveillance procedure has been in effect for a long time, $E(P_{n,\tau})$ can
be approximated by

(3.1) $\pi_\tau = \lim_{n \to \infty} E(P_{n,\tau}).$

Since the Y's are bounded random variables

(3.2) $\pi_\tau = \frac{1}{k} \sum_{i=1}^{k} \sum_{j=0}^{\infty} p(j + \tau) \lim_{n \to \infty} P[X_n(i) = j].$

Under reasonable conditions on p(t) and $L_j$ (for example, if for some T, $L_j[p(j)]$
is bounded away from 1 for all j > T) it is known from the theory of Markov
chains that

(3.3) $v_j = \lim_{n \to \infty} P[X_n(i) = j]$

is independent of i and can be obtained by solving the system of equations

(3.4) $v_{j+1} = v_j p_{j,j+1}$, $j = 0, 1, \ldots$; $\sum_{j=0}^{\infty} v_j = 1$

subject to the conditions that $v_j > 0$ for all j. The $v_j$'s are called “steady-state”
probabilities. If $P[X_1(i) = j] = v_j$, then $X_n(i)$ is a stationary stochastic process

¹ For an exposition on Markov chains, the reader is referred to An Introduction to
Probability Theory and Its Applications, W. Feller, John Wiley & Sons, 2nd edition, 1957.
and $P[X_n(i) = j] = v_j$ for all n and $j = 0, 1, \ldots$. In our notation we have as
solutions

(3.5) $v_j = v_0 \prod_{r=0}^{j-1} L_{r+1}[p(r+1)]$, $v_0 = \dfrac{1}{1 + \sum_{j=1}^{\infty} \prod_{r=0}^{j-1} L_{r+1}[p(r+1)]}.$

We then have

(3.6) $\pi_\tau = \sum_{j=0}^{\infty} p(j + \tau) v_j.$

Thus, assuming “steady-state” conditions, that is, n large enough so that we
can assume $P[X_n(i) = j] = v_j$, we have that $\pi_\tau$ is the expected proportion of
defectives existing in the Nk items τ units after an inspection.
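
As an illustration of (3.5) and (3.6), the steady-state probabilities and $\pi_\tau$ can be computed directly once p(t) and the OC curves are supplied. The sketch below is not from the paper: the infinite sum is cut off at a hypothetical `max_age`, and the particular p(t) and $L_j$ used at the bottom are illustrative assumptions only.

```python
import math

def steady_state(L, p, max_age=400):
    """Steady-state probabilities v_0, ..., v_max_age following (3.5).

    L(j, q): probability that plan j accepts a lot with proportion defective q.
    p(t):    deterioration function.
    The infinite sum in (3.5) is truncated at max_age (an assumption of this sketch).
    """
    weights = [1.0]  # the j = 0 term
    for r in range(max_age):
        weights.append(weights[-1] * L(r + 1, p(r + 1)))
    total = sum(weights)
    return [w / total for w in weights]

def pi_tau(v, p, tau):
    """Expected proportion defective tau units after an inspection, as in (3.6)."""
    return sum(v_j * p(j + tau) for j, v_j in enumerate(v))

# Illustrative inputs, not taken from the paper.
p = lambda t: 1.0 - 0.99 * math.exp(-0.1 * t)
L = lambda j, q: (1.0 - q) ** 5

v = steady_state(L, p)
print(v[0], pi_tau(v, p, 0.0), pi_tau(v, p, 0.5))
```

The truncation is harmless whenever the products in (3.5) decay quickly, which is the situation the condition "bounded away from 1" is meant to guarantee.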

4. Case 1: p(t) a Step-function

In this case let us suppose that

(4.1) $p(t) = p_0$ for $t < T$, $p(t) = p_1$ $(p_1 > p_0)$ for $t \ge T$

and

$L_j = L.$

Study of this case can be justified on the basis that such a step function can be
thought of as a first approximation to the more general case where the
deterioration curve p(t) is an ogive; T corresponds to the value of t where p(t) has
its maximum slope; $p_0$ and $p_1$ correspond to the average values of p(t) for t < T
and t ≥ T, respectively. Then

(4.2) $v_j = v_0 \prod_{r=0}^{j-1} L_{r+1}[p(r+1)] = v_0 L^{j}(p_0)$, $j < T$,
      $v_j = v_0 L^{[T]}(p_0) L^{j-[T]}(p_1)$, $j \ge T$

and

(4.3) $v_0 = \dfrac{1}{\dfrac{1 - L^{[T]+1}(p_0)}{1 - L(p_0)} + \dfrac{L^{[T]}(p_0) L(p_1)}{1 - L(p_1)}}.$

Here [T] denotes the greatest integer less than T. Then we have for
$0 \le \tau < (T - [T])$

(4.4) $\pi_\tau = p_0 \sum_{j=0}^{[T]} v_j + p_1 \sum_{j=[T]+1}^{\infty} v_j = \dfrac{v_0 L^{[T]}(p_0) L(p_1)}{1 - L(p_1)} (p_1 - p_0) + p_0$
and for $0 < (T - [T]) \le \tau < 1$

(4.5) $\pi_\tau = p_0 \sum_{j=0}^{[T]-1} v_j + p_1 \sum_{j=[T]}^{\infty} v_j = \dfrac{v_0 L^{[T]}(p_0)}{1 - L(p_1)} (p_1 - p_0) + p_0.$

The expressions (4.4) and (4.5) can be used to evaluate a given surveillance
plan. If $p_0$, $p_1$, and T are known, then (4.4) and (4.5) indicate the proportion
of defectives in the entire population. However, it is in general unreasonable
to expect $p_1$ and T to be known. If such is the case then a conservative
evaluation (comparable to the AOQL in acceptance sampling) of the plan is given by
the largest of the expressions (4.4) and (4.5) maximized over $p_1$ and T.
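
The closed forms for the step-function case can be checked against a direct summation of (3.5) and (3.6). The sketch below assumes the forms $\pi_\tau = v_0 L^{[T]}(p_0) L(p_1)(p_1 - p_0)/[1 - L(p_1)] + p_0$ for (4.4) and $\pi_\tau = v_0 L^{[T]}(p_0)(p_1 - p_0)/[1 - L(p_1)] + p_0$ for (4.5), with [T] the greatest integer less than T; the OC curve and the numbers used are illustrative assumptions, and both $L(p_0)$ and $L(p_1)$ must be strictly less than 1.

```python
import math

def direct_pi(p0, p1, T, L, tau, terms=400):
    """pi_tau by summing (3.5)-(3.6) term by term for the step function (4.1)."""
    p = lambda t: p0 if t < T else p1
    w = [1.0]
    for r in range(terms):
        w.append(w[-1] * L(p(r + 1)))
    return sum(w_j * p(j + tau) for j, w_j in enumerate(w)) / sum(w)

def closed_form_pi(p0, p1, T, L, tau):
    """pi_tau from the closed forms (4.3)-(4.5), as reconstructed in this sketch."""
    m = math.ceil(T) - 1  # [T]: greatest integer strictly less than T
    L0, L1 = L(p0), L(p1)
    v0 = 1.0 / ((1.0 - L0 ** (m + 1)) / (1.0 - L0) + L0 ** m * L1 / (1.0 - L1))
    if tau < T - m:
        tail = v0 * L0 ** m * L1 / (1.0 - L1)  # mass on ages above [T], as in (4.4)
    else:
        tail = v0 * L0 ** m / (1.0 - L1)       # mass on ages [T] and above, as in (4.5)
    return tail * (p1 - p0) + p0

# Illustrative plan: sample of 5, acceptance number zero.
L = lambda q: (1.0 - q) ** 5
for tau in (0.2, 0.8):
    print(tau, direct_pi(0.02, 0.3, 2.5, L, tau), closed_form_pi(0.02, 0.3, 2.5, L, tau))
```

The two columns printed for each τ should coincide to rounding error; a disagreement would point to a slip in the handling of [T] or of the ranges of τ.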

If we consider the special case p?. = 0 and if L is continuous from the left 
at p = 1 then 

(4.6) ^^[TiVr 


Now in order to find $\max_{p_1, T} \pi_\tau$ we note that expression (4.4), when T is any
positive integer, is always less than or equal to expression (4.5) for any value
of T' when T < T' < T + 1. Whenever T is not an integer (4.5) is greater
than or equal to (4.4). Thus we need only maximize (4.5) over $p_1$ and T.

Since $v_0 L^{[T]}(p_0)$ can easily be shown to be non-increasing in [T], it can easily
be argued that the maximum of (4.5) over $p_1$ and T will always occur for [T]
as small as possible. If T < 1 the maximum is trivially equal to 1. To avoid this
degenerate case we shall bound T away from 1, i.e., we assume 1 < T < 2.
We want to obtain

(4.7) $\max_{p_1, T > 1} \pi_\tau = \max_{p_1} \left[ \dfrac{v_0 L(p_0)}{1 - L(p_1)} (p_1 - p_0) + p_0 \right] = \max_{p_1} \left[ \dfrac{L(p_0)(p_1 - p_0)}{1 + L(p_0) - L(p_1)} \right] + p_0.$

Differentiating the appropriate function in (4.7) with respect to $p_1$ we get

(4.8) $\dfrac{d}{dp_1} \left\{ \dfrac{L(p_0)(p_1 - p_0)}{1 + L(p_0) - L(p_1)} \right\} = \dfrac{[1 + L(p_0) - L(p_1)] L(p_0) + L(p_0)(p_1 - p_0) L'(p_1)}{[1 + L(p_0) - L(p_1)]^2}$

      $= \dfrac{L(p_0) \{ 1 + L(p_0) - L(p_1) + (p_1 - p_0) L'(p_1) \}}{[1 + L(p_0) - L(p_1)]^2}.$

The expression (4.8) (assuming L monotone) is positive for $p_1 = p_0$; hence
the maximizing value $p_1$ of (4.7) is either at one of the roots

(4.9) $1 + L(p_0) - L(p_1) + (p_1 - p_0) L'(p_1) = 0$
or at $p_1 = 1$. The latter will always be the case if there are no roots to (4.9).
If $p_1 = 1$ then, with $L(1) = 0$, (4.7) becomes

(4.10) $\dfrac{L(p_0)(1 - p_0)}{1 + L(p_0)} + p_0 = \dfrac{p_0 + L(p_0)}{1 + L(p_0)}.$

If $p_1 < 1$, then from (4.9) we have that

$(p_1 - p_0) L'(p_1) = -[1 + L(p_0) - L(p_1)]$

and (4.7) becomes

$\max \pi_\tau = -\dfrac{L(p_0)}{L'(p_1)} + p_0.$

Thus given a value $p_0$ and a sampling plan with OC curve L(p) it is possible
to compute a very conservative estimate of the proportion defective on hand.
For the purpose of selecting plans it might now be profitable to tabulate this
value of $\pi_\tau$ for some existing sampling plans and for various values of $p_0$.
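
A tabulation of this conservative value can be sketched with a simple grid search over $p_1$ in place of solving (4.9). The code assumes the objective in (4.7) has the form $L(p_0)(p_1 - p_0)/[1 + L(p_0) - L(p_1)] + p_0$; the grid size and the OC curve $(1 - p)^5$ are illustrative assumptions, not values from the paper.

```python
def conservative_bound(L, p0, grid=10000):
    """Largest value of the objective in (4.7) over a grid of p1 in [p0, 1].

    Returns (p1, value). A grid search is used here instead of locating the
    roots of (4.9); this is a choice of the sketch, not of the paper.
    """
    L0 = L(p0)

    def value(p1):
        return L0 * (p1 - p0) / (1.0 + L0 - L(p1)) + p0

    candidates = [p0 + (1.0 - p0) * i / grid for i in range(grid + 1)]
    best_p1 = max(candidates, key=value)
    return best_p1, value(best_p1)

# Illustrative plan: sample of 5, acceptance number zero, incoming quality 2%.
L = lambda q: (1.0 - q) ** 5
p1_best, bound = conservative_bound(L, 0.02)
print(p1_best, bound)
```

For this particular OC curve the bracketed expression in (4.9) never falls below 1, so (4.9) has no roots and the search ends at $p_1 = 1$, where the bound equals the right side of (4.10).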

5. Case 2: Exponential Deterioration 

In this situation we suppose $p(t) = 1 - \theta e^{-at}$ $(a, \theta > 0)$. Then $1 - \theta = p(0)$
and $p(t) = 1 - e^{-at} + p(0) e^{-at}$. Note that a is a function of the sampling
period. We also assume that $L_j(p) = (1 - p)^k$, an OC curve arising from using a
single sampling plan with acceptance number (C) zero and k the size of the
sample. This, to be sure, is a very special case. However, the results obtained in
this case can serve as an indication of the results in situations less amenable
to calculation. At any rate we have

(5.1) $v_j = v_0 \prod_{r=0}^{j-1} [1 - p(r+1)]^k = v_0 \prod_{r=0}^{j-1} \theta^k e^{-a(r+1)k}$

      $= v_0 \theta^{kj} \exp\left[ -ak \sum_{r=0}^{j-1} (r+1) \right] = v_0 \theta^{kj} e^{-(ak/2) j(j+1)},$

      $v_0 = \dfrac{1}{1 + \sum_{j=1}^{\infty} \theta^{kj} e^{-(ak/2) j(j+1)}}.$

We then have

(5.2) $\pi_\tau = \sum_{j=0}^{\infty} v_j (1 - \theta e^{-a(j+\tau)}) = 1 - \theta e^{-a\tau} \sum_{j=0}^{\infty} v_j e^{-aj}$

      $= 1 - \theta e^{-a\tau} \sum_{j=0}^{\infty} v_0 \theta^{kj} e^{-(ak/2) j(j+1)} e^{-aj}$

      $= 1 - \theta v_0 e^{-a\tau} \sum_{j=0}^{\infty} \theta^{kj} e^{-(ak/2) j(j+1) - aj}.$
7=0 
Since in any physical situation, a and t will be given, p(t) will be determined
in a specific time dimension (i.e., seconds, weeks, etc.). We can set at = a*τ,
determine a* according to some reasonable criterion, and then obtain τ.

By numerical methods a can be found to minimize $\pi_\tau$. Then the length of the
period between inspections can be adjusted so that the deterioration function
will have that particular value of a. The optimal interval between inspections
is a function of the particular sampling plan used, i.e., we considered the plan
with OC curve $(1 - p)^k$. It would be of interest to see how much the optimal
a varies with a change in sampling plan.
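
A numerical study of this kind starts from a routine that evaluates $\pi_\tau$ for given a. The sketch below assumes $p(t) = 1 - \theta e^{-at}$ and $L(p) = (1 - p)^k$, so that the steady-state weights are $\theta^{kj} e^{-(ak/2)j(j+1)}$; the truncation point `terms`, the parameter values, and the list of a values are illustrative assumptions. It only tabulates $\pi_\tau$ against a; the criterion used to pick a is left open.

```python
import math

def pi_exponential(a, theta, k, tau, terms=500):
    """pi_tau for p(t) = 1 - theta*exp(-a t) and OC curve (1 - p)**k.

    Uses the weights theta**(k j) * exp(-(a k / 2) j (j + 1)) of (5.1),
    with the infinite sums truncated after `terms` terms.
    """
    w = [theta ** (k * j) * math.exp(-0.5 * a * k * j * (j + 1)) for j in range(terms)]
    v0 = 1.0 / sum(w)
    discounted = sum(w_j * math.exp(-a * j) for j, w_j in enumerate(w))
    return 1.0 - theta * v0 * math.exp(-a * tau) * discounted

# Tabulate pi_tau against a for theta = .99, sample size 5, tau = 0.5.
for a in (0.01, 0.05, 0.1, 0.5, 1.0):
    print(a, pi_exponential(a, 0.99, 5, 0.5))
```

Since a depends on the length of the sampling period, such a table can be read as the effect of inspecting more or less frequently with the same plan.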

6. The Truncated Model 

In some situations, a surveillance policy might dictate that a lot, after having
been on hand for a certain length of time, must be automatically replaced by a
new lot. Thus if a lot is replaced after M + 1 periods we have in the Markov
chain that $p_{M,0} = 1$. With this modification the “steady-state” probabilities
are

(6.1) $v_j = v_0 \prod_{r=0}^{j-1} p_{r,r+1} = v_0 \prod_{r=0}^{j-1} L_{r+1}[p(r+1)]$ $(j = 1, \ldots, M)$,

      $v_0 = \dfrac{1}{1 + \sum_{j=1}^{M} \prod_{r=0}^{j-1} L_{r+1}[p(r+1)]}.$

The analogue of $\pi_\tau$ is

(6.2) $\pi_{\tau,M} = \sum_{j=0}^{M} v_j p(j + \tau).$

It is of interest to carry out computations to determine the effect of truncation
on the level of quality maintained. This has been tried for several values of the
parameters, where p(t) is exponential and a = 1, and some graphs have been
drawn to depict the relationships. The graphs in Figures 1 through 3 are not
expected to portray any realistic situation but merely to demonstrate the effect
of truncation on several sampling plans. In Figure 1 each lot is replaced
after two periods (M = 1), θ = .99, and a sample size of 5 is considered with
acceptance numbers C (number of defectives tolerated for acceptance) as
depicted. Figure 2 considers exactly the same situation for M = 2, and Figure 3
for M = 3.
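
Computations of this kind can be sketched as follows. The code evaluates the finite sums of (6.1) and (6.2) for the exponential p(t) with a = 1 and θ = .99 and a sample of 5. The binomial form of the OC curve for acceptance number C is an assumption of this sketch (the paper does not state which form was used for the graphs), and the function names are hypothetical.

```python
import math

def binomial_oc(n, c):
    """OC curve of a single sampling plan (sample size n, acceptance number c),
    under a binomial model for the number of defectives in the sample."""
    def L(q):
        return sum(math.comb(n, d) * q ** d * (1.0 - q) ** (n - d) for d in range(c + 1))
    return L

def truncated_plan(L, p, M, tau):
    """Steady-state probabilities (6.1) and the analogue (6.2) of pi_tau
    when every lot is replaced after M + 1 periods."""
    w = [1.0]
    for r in range(M):
        w.append(w[-1] * L(p(r + 1)))
    total = sum(w)
    v = [x / total for x in w]
    return v, sum(v_j * p(j + tau) for j, v_j in enumerate(v))

p = lambda t: 1.0 - 0.99 * math.exp(-1.0 * t)  # exponential deterioration, a = 1
for M in (1, 2, 3):
    for C in (0, 1, 2):
        v, pi = truncated_plan(binomial_oc(5, C), p, M, 0.5)
        print(M, C, round(v[0], 4), round(pi, 4))
```

Each printed row gives M, C, the proportion of new lots $v_0$, and the proportion defective half a period after inspection.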


7. Selection of a Surveillance Plan 

Discussion and analysis in the previous sections are not extensive enough to
make possible the selection of an optimal surveillance sampling plan or set of
sampling plans for some specific purpose. However, the expressions for $v_0$ and
$\pi_\tau$ make it possible to choose the better of two offered plans or the best among
a finite number of offered plans. For example, suppose we desire the proportion
of defectives on hand to be never greater than $\pi^*$. Then among the possible
plans select those which maintain $\pi_\tau \le \pi^*$ for all τ. Now $v_0$ indicates the
proportion of all the lots that are new. Since there is some expense incurred in
obtaining a new lot, one would desire to select from among all plans where

$\pi_\tau \le \pi^*$

that plan having the smallest $v_0$.
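
The selection rule can be sketched for a finite list of offered plans. Everything specific here is an assumption made for illustration: the binomial OC curves, the exponential p(t), the cut-off `horizon` for the infinite sums, and the use of a finite grid of τ values to stand in for "all τ".

```python
import math

def binomial_oc(n, c):
    """OC curve for sample size n and acceptance number c (binomial model, assumed)."""
    def L(q):
        return sum(math.comb(n, d) * q ** d * (1.0 - q) ** (n - d) for d in range(c + 1))
    return L

def evaluate(L, p, horizon=400, taus=tuple(i / 20.0 for i in range(20))):
    """Return (v_0, largest pi_tau over the tau grid) using (3.5) and (3.6)."""
    w = [1.0]
    for r in range(horizon):
        w.append(w[-1] * L(p(r + 1)))
    total = sum(w)
    worst = max(sum(w_j * p(j + tau) for j, w_j in enumerate(w)) / total for tau in taus)
    return 1.0 / total, worst

def select_plan(plans, p, pi_star):
    """Among plans (n, c) with pi_tau <= pi_star on the grid, pick the smallest v_0.

    Returns (v_0, n, c), or None when no offered plan meets the requirement.
    """
    feasible = []
    for n, c in plans:
        v0, worst = evaluate(binomial_oc(n, c), p)
        if worst <= pi_star:
            feasible.append((v0, n, c))
    return min(feasible) if feasible else None

p = lambda t: 1.0 - 0.99 * math.exp(-0.05 * t)  # illustrative deterioration function
plans = [(5, 0), (5, 1), (10, 0)]
print(select_plan(plans, p, 0.10))
```

A result of None signals that the catalogue of offered plans has to be widened or the requirement $\pi^*$ relaxed.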

Naturally much work remains before consideration can be given to a standard
catalogue of surveillance plans but the thinking in this report should be helpful
as a first step towards this goal. For example, in the case of a general
deterioration function p(t), the actual computation of $\pi_\tau$ and $v_0$ could be quite difficult.
The use of the step-function model discussed in Section 4 only serves as a first
approximation and the use of the specific exponential form discussed in Section
5 could be quite unrealistic. It would be interesting and useful to determine
how well the step-function model serves as a first approximation for various
deterioration functions.

[The original paper contained nine graphs illustrating the effects of parameter 
variations. Only three of those graphs are reproduced here. —The Editor.] 
Fig. 1 (M = 1)

Fig. 2 (M = 2)

Fig. 3 (M = 3)